Conversion of HES Basic Master File to RDF

The Ctrl + All Tooling can convert HES Basic Master Files to a Resource Description Framework (RDF) representation. This note describes the conversion.

Overview

HES Basic Master Files are described in RDF using the RDF Data Cube Vocabulary (QB). A dedicated vocabulary - HES-BMF - is used to describe the HES Basic Master File and the various fields of the many record types.

HES Basic Master Files consist of 22 different record types. For each record type, the conversion defines a QB Data Set. Additionally, a QB Data Set is defined for all records in the Basic Master File. See, for example, the data set of all records contained in the HES Basic Master File HES.Y6907.Y7302.
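The data set layout described above can be sketched as follows. This is a minimal illustration rather than the actual conversion code: the base URI and the record type names are hypothetical placeholders, and only the QB and RDF namespace URIs are taken from the published vocabularies. The sketch emits one qb:DataSet declaration for the whole file and one per record type, as N-Triples lines.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI; the URIs minted by the real conversion may differ.
BASE = "http://example.org/hes/"


def dataset_uri(file_name, record_type=None):
    """Return the URI of the data set for a whole file or one record type."""
    uri = BASE + file_name
    if record_type is not None:
        uri += "/" + record_type
    return uri


def dataset_triples(file_name, record_types):
    """Yield N-Triples lines declaring a qb:DataSet for the whole file
    and one qb:DataSet per record type."""
    for record_type in [None, *record_types]:
        subject = dataset_uri(file_name, record_type)
        yield f"<{subject}> <{RDF_TYPE}> <{QB}DataSet> ."


# The record type names below are invented for illustration.
lines = list(dataset_triples("HES.Y6907.Y7302",
                             ["hamlet-monthly", "village-monthly"]))
for line in lines:
    print(line)
```

With all 22 record types listed, the same loop would yield 23 data set declarations per Basic Master File.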

Implementation

The conversion is implemented in two steps (similar to the Conversion of NIPS Data Files to RDF):

  1. Parsing of HES Basic Master File Records to Python values. This is implemented in the module hes.basic_master_file.
  2. Serialization of the Python representation to RDF. This is implemented in the module ctrlall.cmds.bmf2rdf.
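The two steps can be illustrated with a small self-contained sketch. It does not reproduce hes.basic_master_file or ctrlall.cmds.bmf2rdf: the field offsets, the RecordHeader class, the HES-BMF namespace URI and the property names are all assumptions made for the example. Only qb:Observation and qb:dataSet come from the QB vocabulary.

```python
from dataclasses import dataclass

QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace; the real HES-BMF vocabulary URI is not given here.
HES_BMF = "http://example.org/hes-bmf#"


@dataclass
class RecordHeader:
    record_code: str
    record_start_date: str
    record_stop_date: str


def parse_header(line):
    """Step 1: slice a fixed-width record into a Python value.

    The offsets are invented for illustration and are not the real layout.
    """
    return RecordHeader(
        record_code=line[0:4],
        record_start_date=line[4:8],
        record_stop_date=line[8:12],
    )


def to_ntriples(subject, dataset, record):
    """Step 2: serialize the Python value as a QB observation (N-Triples)."""
    return [
        f"<{subject}> <{RDF_TYPE}> <{QB}Observation> .",
        f"<{subject}> <{QB}dataSet> <{dataset}> .",
        f'<{subject}> <{HES_BMF}recordCode> "{record.record_code}" .',
        f'<{subject}> <{HES_BMF}recordStartDate> "{record.record_start_date}" .',
    ]


record = parse_header("302971017301")
triples = to_ntriples("http://example.org/hes/record/1",
                      "http://example.org/hes/HES.Y6907.Y7302",
                      record)
print("\n".join(triples))
```

Keeping the parsed value free of any RDF concerns means the parser can be tested on its own, and the serializer can change its URIs or vocabulary without touching the record layout code.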

Open Issues/Questions

Question Response Record

The question response records (Hamlet Monthly, Hamlet Quarterly, Village Monthly and Village Quarterly) contain encoded question responses. For example:

A44B00000009C0002D0000000Z999999999

How can this be decoded and mapped to questions from the QTAB Questions?
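As a starting point for investigating this, the example string can at least be tokenized. The sketch below assumes, without confirmation from the documentation, that each uppercase letter starts a new group and that the digits following it belong to that group. It does not decode anything, and the function name split_groups is made up for this note.

```python
import re

RAW = "A44B00000009C0002D0000000Z999999999"


def split_groups(encoded):
    """Split an encoded response string into (letter, digits) pairs.

    Assumption: a group is one uppercase letter followed by digits.
    Raises ValueError if part of the input does not fit that pattern.
    """
    groups = re.findall(r"([A-Z])(\d+)", encoded)
    if "".join(letter + digits for letter, digits in groups) != encoded:
        raise ValueError(f"unexpected structure in {encoded!r}")
    return groups


groups = split_groups(RAW)
for letter, digits in groups:
    print(letter, digits)
```

Whether the letters correspond to topic groups, question identifiers or something else entirely remains open, as does the meaning of the digit runs.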

QTAB Question Description Records - VARIABLE QUESTION NOT USED THIS CYCLE.

Some QTAB Question Description Records hold a single question text group (the structure that stores question and response text):

71017105VARIABLE QUESTION NOT USED THIS CYCLE.

According to the documentation (Basic Master File Description), this does not seem to be valid. Our code currently misparses it as response number 71 of a question with no question text:

QTABQuestionDescriptionRecord(length=87,
                              os_control=0,
                              usid='         ',
                              record_code='3029',
                              record_start_date='7101',
                              record_stop_date='7301',
                              activity_code='0',
                              question_code='HMZ03',
                              level_code='HM',
                              topic_group_code='Z',
                              question_number=3,
                              maximum_response=0,
                              question_text_group_count=1,
                              question_text='',
                              responses={71: '017105VARIABLE QUESTION NOT USED '
                                             'THIS CYCLE.'})
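One possible stopgap, sketched here under the assumption that the placeholder text is always spelled exactly as in the example, is to detect such text groups before they reach the response parser. The names UNUSED_MARKER and split_unused_placeholder are hypothetical and are not part of hes.basic_master_file. How the leading digits should be interpreted is left open.

```python
UNUSED_MARKER = "VARIABLE QUESTION NOT USED THIS CYCLE."


def split_unused_placeholder(text_group):
    """Return the undecoded prefix if text_group is an unused-question
    placeholder, otherwise None.

    Assumption: a placeholder always ends with UNUSED_MARKER, possibly
    followed by padding whitespace.
    """
    stripped = text_group.rstrip()
    if not stripped.endswith(UNUSED_MARKER):
        return None
    return stripped[:-len(UNUSED_MARKER)]


prefix = split_unused_placeholder(
    "71017105VARIABLE QUESTION NOT USED THIS CYCLE.")
print(repr(prefix))
```

A caller could then record the question as unused for the cycle instead of storing the marker text as a response.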

Questions: