Ctrl + All Tooling Overview

This note describes the tooling used by the Ctrl + All. Computing the Social project. The tooling is used for:

and provides:

Knowledge Graph

The Ctrl + All Knowledge Graph consists of three types of data:

  1. Binary artifacts: Examples include NIPS files.
  2. RDF: Structured and semantic data about binary artifacts, data converted from binary artifacts, annotations and metadata.
  3. Notes: Plain-text notes as well as other supporting resources (such as PDFs).

Importing Data into the Knowledge Graph

Data can be imported into the knowledge graph using the ctrlall import command:

$ python -m ctrlall import ctrlall.ttl vocab/*.ttl
urn:eris:BIAFQJFTQIAGSPBJ7XMRNHOBWA42UUAS5CCHU24U6FF43C5TADJSHDAN4KTTAPONQX62ES4XKWC6PNW5KXCBFLRIETMDS6KKWOVEN564GQ
file:///vocab/dcat__vocab.ttl
file:///vocab/discovery__vocab.ttl
file:///vocab/hes-bmf__vocab.ttl
file:///vocab/nipsv__vocab.ttl
file:///vocab/prov__vocab.ttl
file:///vocab/qb__vocab.ttl

This reads the content of the ctrlall.ttl file (an RDF/Turtle file) as well as the RDF vocabularies in the vocab directory and imports the content into the knowledge graph.

The first URN printed is the identifier of the content of the ctrlall.ttl file. You can view the content by following this link:

urn:eris:BIAFQJFTQIAGSPBJ7XMRNHOBWA42UUAS5CCHU24U6FF43C5TADJSHDAN4KTTAPONQX62ES4XKWC6PNW5KXCBFLRIETMDS6KKWOVEN564GQ

or by entering the URN into the search bar in the top right.

References to other files in the repository that use file: URIs are resolved and imported as well. For example, the ctrlall.ttl file contains a reference to file:data/HES/HES.HAMLA.67.ttl. When ctrlall.ttl is imported, the file data/HES/HES.HAMLA.67.ttl is imported too.
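To illustrate how file: references can be discovered, here is a minimal sketch. It is not the ctrlall implementation: the function name file_references, the regular-expression approach, the rdfs:seeAlso predicate in the sample, and the rule of resolving paths against a base directory are all assumptions made for this example. A real importer would more likely walk the parsed RDF graph.

```python
import re
from pathlib import PurePosixPath

# Matches Turtle IRI references of the form <file:some/path>.
# Assumption: a regex scan is good enough for illustration; it does not
# understand Turtle comments or string literals.
FILE_REF = re.compile(r"<file:([^>\s]+)>")


def file_references(turtle_text: str, base_dir: str = ".") -> list[str]:
    """Return the paths of all file: references, resolved against base_dir.

    Hypothetical helper: how ctrlall actually resolves relative paths is
    not stated in this note.
    """
    base = PurePosixPath(base_dir)
    return [str(base / path) for path in FILE_REF.findall(turtle_text)]


# Hypothetical excerpt; only the file: URI itself is taken from this note.
sample = """
<> rdfs:seeAlso <file:data/HES/HES.HAMLA.67.ttl> ;
   rdfs:seeAlso <http://example.org/not-a-file> .
"""

print(file_references(sample))
```

Only the file: reference is returned; the http: reference is ignored.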

Imported data is content-addressed, meaning that the identifiers are computed from the content of the data itself. This allows deterministic identifiers without any centralized coordination. The computed identifier is printed when importing data.
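The idea of content addressing can be sketched in a few lines. The example below is deliberately simplified and does not compute urn:eris identifiers: it uses a plain SHA-256 digest with Base32 encoding under a made-up urn:example:sha256: prefix, and the function name content_urn is hypothetical. It only demonstrates the property described above: the same bytes always yield the same identifier, with no coordination needed.

```python
import base64
import hashlib

# Made-up prefix for illustration; real identifiers in this project use urn:eris.
PREFIX = "urn:example:sha256:"


def content_urn(content: bytes) -> str:
    """Return a deterministic identifier derived only from the content.

    Assumption: SHA-256 plus unpadded Base32. This is NOT the ERIS encoding.
    """
    digest = hashlib.sha256(content).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=")
    return PREFIX + encoded


first = content_urn(b"<> a <http://example.org/Thing> .\n")
second = content_urn(b"<> a <http://example.org/Thing> .\n")
other = content_urn(b"<> a <http://example.org/Other> .\n")

print(first == second)  # identical content, identical identifier
print(first == other)   # different content, different identifier
```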

Data File Descriptions

One specific type of RDF content is the description of a data file. These descriptions are stored as plain-text RDF/Turtle files in the data subdirectory of the ctrlall repository. For example, HES.HAMLA.68.ttl:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix nips: <http://computingthesocial.net/ns/nips/> .

<> a nips:DataFile ;
    dcterms:title "HES.HAMLA.68.NIPS" ;
    dcterms:format "NIPS" ;
    dcterms:publisher "NARA" ;
    rdf:value <file:HES.HAMLA.68.NIPS> .

The previous invocation of the import command imported these statements and gave them the content-addressed identifier: urn:eris:BIANYKNCSDX2ZHV6DG6NYFXFXUQ7PMVUZLYLR26BBFKBNLA65CEJJ6CK3C6PKQKXYZ2H5LEAGRJZXOXSA3GYX4QLMOQHZ3XTDLP5USEXGQ.

It describes the binary file HES.HAMLA.68.NIPS with some metadata using predicates from Dublin Core.

The rdf:value predicate is used to link the metadata with the actual binary content, which is also assigned a content-addressed identifier when importing.
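Descriptions like the one above follow a regular pattern, so they could be generated from a template. The following is a rough sketch under stated assumptions: the helper describe_data_file and the template are hypothetical and not part of the ctrlall tooling, and the values are inserted without escaping, so it only works for strings that contain no quotes or backslashes.

```python
from pathlib import PurePosixPath

# Template mirroring the HES.HAMLA.68.ttl example from this note.
TEMPLATE = """\
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix nips: <http://computingthesocial.net/ns/nips/> .

<> a nips:DataFile ;
    dcterms:title "{title}" ;
    dcterms:format "{fmt}" ;
    dcterms:publisher "{publisher}" ;
    rdf:value <file:{name}> .
"""


def describe_data_file(binary_path: str, fmt: str, publisher: str) -> str:
    """Return a Turtle description for a binary data file.

    Hypothetical helper: the title and the file: reference both use the
    file name, as in the example above. No literal escaping is performed.
    """
    name = PurePosixPath(binary_path).name
    return TEMPLATE.format(title=name, fmt=fmt, publisher=publisher, name=name)


description = describe_data_file("data/HES.HAMLA.68.NIPS", "NIPS", "NARA")
print(description)
```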

Conversion to RDF

The tooling can convert NIPS Data Files as well as HES Basic Master Files to a Resource Description Framework (RDF) representation. See Conversion of NIPS Data Files to RDF and Conversion of HES Basic Master File to RDF.

Notes

Notes are Markdown documents named using the Denote file-naming scheme:

DATE--TITLE__KEYWORDS.md

The DATE serves as a unique identifier for notes.

Technical documents and other assets that are not Markdown files follow the same file-naming scheme.
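File names in this scheme can be taken apart mechanically. The sketch below assumes the usual Denote conventions, which this note does not spell out: DATE has the form YYYYMMDDTHHMMSS, title words are joined by hyphens, and keywords are joined by underscores. The function parse_note_name and the sample file names are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple


class NoteName(NamedTuple):
    identifier: str      # the DATE part, used as the unique identifier
    created: datetime    # the DATE part parsed as a timestamp
    title: str
    keywords: list[str]
    extension: str


# DATE--TITLE__KEYWORDS.EXT, where the __KEYWORDS part is optional.
NOTE_NAME = re.compile(
    r"^(?P<date>\d{8}T\d{6})--(?P<title>[^_.]+?)"
    r"(?:__(?P<keywords>[^.]+))?\.(?P<ext>\w+)$"
)


def parse_note_name(filename: str) -> NoteName:
    """Split a Denote-style file name into its components."""
    match = NOTE_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a Denote-style file name: {filename!r}")
    keywords = match["keywords"].split("_") if match["keywords"] else []
    return NoteName(
        identifier=match["date"],
        created=datetime.strptime(match["date"], "%Y%m%dT%H%M%S"),
        title=match["title"].replace("-", " "),
        keywords=keywords,
        extension=match["ext"],
    )


note = parse_note_name("20240115T093000--tooling-overview__ctrlall_rdf.md")
print(note.identifier, note.title, note.keywords)
```

The same function handles non-Markdown assets, since only the extension differs.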

Web-based User Interface

SPARQL Endpoint

Related Projects and Inspiration