4. GraphDB

We use GraphDB to store the majority of the data. GraphDB is a triplestore database that supports RDF and SPARQL; the details are here. The server connects to the GraphDB repository cmmp via localhost:7200, and to cmmpTest in the test environment.
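
As a minimal sketch of how the repository name and address fit together, the helper below picks cmmp or cmmpTest and builds the repository URL. The name repositoryEndpoint and the http:// scheme are assumptions for illustration; the project itself reads the address from its config module.

```javascript
// Hypothetical helper, not part of the project code.
// Assumes GraphDB is reachable at an address such as http://localhost:7200.
function repositoryEndpoint(addr, isTest) {
  // cmmpTest is used in the test environment, cmmp otherwise.
  const name = isTest ? 'cmmpTest' : 'cmmp';
  // Strip trailing slashes so the resulting URL is well formed.
  const base = addr.replace(/\/+$/, '');
  return {name, url: `${base}/repositories/${name}`};
}

console.log(repositoryEndpoint('http://localhost:7200', false).url);
```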

4.1 GraphDB Dashboard

GraphDB provides a dashboard user interface for managing database repositories; the related documentation is here.

4.2 Loader/Settings

4.2.1 Preload NPM packages

First, preload the related NPM packages and utility settings. The extra packages required are graphdb, fs (built into Node.js), node-fetch, and form-data.

const {ServerClient, ServerClientConfig} = require('graphdb').server;
const {RepositoryClientConfig, RDFRepositoryClient} = require('graphdb').repository;
const {SparqlJsonResultParser, JsonLDParser} = require('graphdb').parser;
const {RDFMimeType} = require('graphdb').http;
const {graphdb} = require('../config');
const fs = require('fs');
const fetch = require('node-fetch');
const FormData = require('form-data');
const {sleep} = require('../utils');

4.2.2 GraphDB Repository

Then you can create the GraphDB repository as follows, importing the RDF4J configuration from config.ttl.

async function createRepository() {
  // Send the RDF4J repository configuration as multipart form data.
  const form = new FormData();
  form.append('config', fs.createReadStream(__dirname + '/config.ttl'));
  form.append('location', '');
  const res = await fetch(graphdb.addr + '/rest/repositories', {
    method: 'POST',
    body: form,
    headers: form.getHeaders()
  });
  if (res.status === 500) {
    throw new Error('Failed to create repository: ' + (await res.json()).message);
  }
}
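
The function above only treats status 500 as a failure. As a hedged sketch of a stricter check, the hypothetical helper below rejects every non-2xx response and tolerates a body that is not JSON. The name assertRepositoryCreated is an assumption; it relies only on the status property and text() method of a fetch response.

```javascript
// Hypothetical helper, not part of the project code.
// Treats every non-2xx status as a failure.
async function assertRepositoryCreated(res) {
  if (res.status >= 200 && res.status < 300) {
    return;
  }
  // The body may not be JSON, so read it as text first.
  const text = await res.text();
  let message = text;
  try {
    message = JSON.parse(text).message || text;
  } catch (e) {
    // Not JSON: keep the raw text as the message.
  }
  throw new Error(`Failed to create repository (HTTP ${res.status}): ${message}`);
}
```

It could be called as await assertRepositoryCreated(res) in place of the status check.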

You also need a function that returns the repository, which the load function needs later on.

async function getRepository() {
  // Poll until load() has assigned the module-level repository variable.
  while (!repository) {
    await sleep(100);
  }
  return repository;
}
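
The sleep helper comes from ../utils and is not shown here. The sketch below assumes it is a simple promise around setTimeout, and adds a hypothetical waitFor variant that gives up after a timeout instead of polling forever. Both names and the default timings are assumptions, not the project's code.

```javascript
// Assumed implementation of the sleep helper imported from ../utils.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical variant of the polling loop with a deadline.
// getValue is a function returning the current value, e.g. () => repository.
async function waitFor(getValue, timeoutMs = 5000, intervalMs = 100) {
  const deadline = Date.now() + timeoutMs;
  while (!getValue()) {
    if (Date.now() >= deadline) {
      throw new Error(`Value not available after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
  return getValue();
}
```

A timeout makes a failed or missing load() call surface as an error rather than as a request that never resolves.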

4.2.3 Load Data

Then we can load the RDF data into GraphDB.

Note: set overwrite to true when you want to reset the GraphDB data. By default, the data is always reset when the tests run in the test environment.

async function loadInitialData(file, overwrite = !!process.env.test) {
  const contentType = RDFMimeType.TURTLE;
  return new Promise((resolve, reject) => {
    fs.readFile(file, (err, data) => {
      // Reject if the file cannot be read instead of uploading undefined.
      if (err) {
        return reject(err);
      }
      repository[overwrite ? 'overwrite' : 'upload'](data, contentType, null, null)
        .then(() => resolve())
        .catch(reason => reject(reason));
    });
  });
}
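
The default value of overwrite is derived from an environment variable, and environment variables are always strings. The small sketch below mirrors that expression to show which values enable the reset; defaultOverwrite is a hypothetical name used only for illustration.

```javascript
// Mirrors the default used above: overwrite = !!process.env.test
function defaultOverwrite(env) {
  return !!env.test;
}

console.log(defaultOverwrite({}));              // false: variable not set
console.log(defaultOverwrite({test: '1'}));     // true
console.log(defaultOverwrite({test: 'false'})); // true: any non-empty string is truthy
console.log(defaultOverwrite({test: ''}));      // false: empty string is falsy
```

In other words, setting test=false still resets the data; the variable has to be unset or empty to keep it.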

4.2.4 Load GraphDB

With the packages, the repository functions, and the data loader in place, we can load the whole GraphDB module in the loader.

let dbClient, repository;

async function load() {
  const DBName = process.env.test ? "cmmpTest" : "cmmp";
  const serverConfig = new ServerClientConfig(graphdb.addr, 0, {
    'Accept': RDFMimeType.SPARQL_RESULTS_JSON
  });
  dbClient = new ServerClient(serverConfig);

  // Create the repository if it does not exist yet.
  const ids = await dbClient.getRepositoryIDs();
  if (!ids.includes(DBName)) {
    await createRepository();
    console.log(`Repository \`${DBName}\` created.`);
  }

  const readTimeout = 30000;
  const writeTimeout = 30000;
  repository = await dbClient.getRepository(DBName,
    new RepositoryClientConfig([graphdb.addr + `/repositories/${DBName}`], {}, '', readTimeout, writeTimeout));

  // using https://github.com/rubensworks/sparqljson-parse.js
  repository.registerParser(new SparqlJsonResultParser());
  repository.registerParser(new JsonLDParser());
  console.log(`GraphDB ${DBName} connected.`);

  await loadInitialData(__dirname + '/../ontologies/creative_mixed-use_ontology.ttl');

  // Namespaces
  await repository.saveNamespace('owl', 'http://www.w3.org/2002/07/owl#');
  await repository.saveNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
  await repository.saveNamespace('xml', 'http://www.w3.org/XML/1998/namespace');
  await repository.saveNamespace('xsd', 'http://www.w3.org/2001/XMLSchema#');
  await repository.saveNamespace('rdfs', 'http://www.w3.org/2000/01/rdf-schema#');
  console.log('GraphDB loaded.');
}

module.exports = {getRepository, load};

Note: The namespaces are saved for convenience, so that SPARQL queries can be built without repeating the prefix declarations.
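
To illustrate what the saved namespaces spare each query from repeating, the sketch below builds the equivalent PREFIX header by hand. The prefixHeader helper is hypothetical; the IRIs are the ones saved in load().

```javascript
// The same namespaces that load() saves in the repository.
const namespaces = {
  owl: 'http://www.w3.org/2002/07/owl#',
  rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
  xml: 'http://www.w3.org/XML/1998/namespace',
  xsd: 'http://www.w3.org/2001/XMLSchema#',
  rdfs: 'http://www.w3.org/2000/01/rdf-schema#',
};

// Hypothetical helper: one PREFIX line per namespace entry.
function prefixHeader(ns) {
  return Object.entries(ns)
    .map(([prefix, iri]) => `PREFIX ${prefix}: <${iri}>`)
    .join('\n');
}

console.log(prefixHeader(namespaces));
```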

4.3 SPARQL, RDF, and Ontologies

4.3.1 RDF

RDF stands for Resource Description Framework. You can read the details here.

4.3.2 SPARQL

SPARQL is a powerful query language for RDF data. You can read the details here.

4.3.3 Ontology

A description of ontologies can be found here.

4.4 GraphDB Utils

We developed an object-oriented tool for creating GraphDB models and manipulating their data more conveniently. It creates and executes the SPARQL queries internally.

4.4.1 Create a Model

You can create a GraphDBModel with the functionality provided by utils/graphDB/graphDBSchema.js.

const PrimaryContactModel = GraphDBModel({
  first_name: String,
  last_name: String,
  position: String,
  telephone: String,
  telephone_ext: String,
  email: String,
}, {
  rdfTypes: [Types.NamedIndividual, ':primary_contact_test'],
  name: 'primary_contact_test'
});

4.4.2 Add new data in GraphDB

const newPrimaryContact = await PrimaryContactModel({
  first_name: "Test",
  last_name: "User",
  position: "Software Developer",
  telephone: "1234567890",
  telephone_ext: "123",
  email: "testuser@example.com",
});

This generates the following SPARQL query:

PREFIX : <http://cmmp#>
INSERT DATA {
:primary_contact_1 rdf:type owl:NamedIndividual, :primary_contact;
:has_first_name "Test";
:has_last_name "User";
:has_position "Software Developer";
:has_email "testuser@example.com";
:has_telephone "1234567890";
:has_telephone_ext "123".
}
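
As a rough sketch of how a plain object could be turned into a query of this shape, the hypothetical functions below build an INSERT DATA statement. This is not the code in utils/graphDB/graphDBSchema.js; the names literal and buildInsert, and the has_ predicate naming rule, are assumptions inferred from the query above.

```javascript
// Hypothetical sketch, not the project's implementation.
// Escapes backslashes, double quotes, and line breaks for a SPARQL string literal.
function literal(value) {
  const escaped = String(value)
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
  return `"${escaped}"`;
}

// Builds one INSERT DATA statement for a single subject.
// Assumes each field maps to a predicate named :has_<field>.
function buildInsert(subject, rdfTypes, data) {
  const lines = [`${subject} rdf:type ${rdfTypes.join(', ')}`];
  for (const [field, value] of Object.entries(data)) {
    lines.push(`:has_${field} ${literal(value)}`);
  }
  return `PREFIX : <http://cmmp#>\nINSERT DATA {\n${lines.join(';\n')}.\n}`;
}

console.log(buildInsert(
  ':primary_contact_1',
  ['owl:NamedIndividual', ':primary_contact'],
  {first_name: 'Test', last_name: 'User'}
));
```

Escaping the values matters because they end up inside the query text; an unescaped double quote would otherwise terminate the literal early.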

4.4.3 Search Data

Each of the following examples shows a find call together with the SPARQL query it generates.

Find all primary contacts:

const all_primary_contact = PrimaryContactModel.find({})

SPARQL:

PREFIX : <http://cmmp#>
CONSTRUCT {
?s ?p0 ?o0
} WHERE {
?s ?p0 ?o0.
?s rdf:type owl:NamedIndividual, :primary_contact.
}

Find all primary contacts whose first name is Test:

const primary_contacts_with_first_name = PrimaryContactModel.find({first_name: "Test"})

SPARQL:

PREFIX : <http://cmmp#>
CONSTRUCT {
?s ?p0 ?o0
} WHERE {
?s ?p0 ?o0.
?s rdf:type owl:NamedIndividual, :primary_contact.
?s :has_first_name ?o0_0.
FILTER(?o0_0 = "Test")
}

Find the primary contacts whose first name is Test and whose email is testuser@example.com:

const primary_contact_with_email = PrimaryContactModel.find({first_name: "Test", email: "testuser@example.com"})

SPARQL:

PREFIX : <http://cmmp#>
CONSTRUCT {
?s ?p0 ?o0
} WHERE {
?s ?p0 ?o0.
?s rdf:type owl:NamedIndividual, :primary_contact.
?s :has_first_name ?o0_0.
FILTER(?o0_0 = "Test")
?s :has_email ?o0_3.
FILTER(?o0_3 = "testuser@example.com")
}
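
The three queries above follow one pattern: a fixed CONSTRUCT skeleton plus one triple pattern and one FILTER per filter key. The sketch below reproduces that pattern under stated assumptions. buildFind and FIELDS are hypothetical names, and the field order is a guess chosen so that email maps to ?o0_3 as in the last query; the real tool may number its variables differently.

```javascript
// Hypothetical sketch of translating a find() filter into a CONSTRUCT query.
// The field order is an assumption, chosen so that email maps to ?o0_3.
const FIELDS = ['first_name', 'last_name', 'position', 'email', 'telephone', 'telephone_ext'];

function buildFind(filter) {
  const where = [
    '?s ?p0 ?o0.',
    '?s rdf:type owl:NamedIndividual, :primary_contact.',
  ];
  for (const [field, value] of Object.entries(filter)) {
    const index = FIELDS.indexOf(field);
    if (index === -1) {
      throw new Error(`Unknown field: ${field}`);
    }
    const variable = `?o0_${index}`;
    where.push(`?s :has_${field} ${variable}.`);
    // JSON.stringify quotes and escapes the value; adequate for simple strings only.
    where.push(`FILTER(${variable} = ${JSON.stringify(value)})`);
  }
  return [
    'PREFIX : <http://cmmp#>',
    'CONSTRUCT {',
    '?s ?p0 ?o0',
    '} WHERE {',
    ...where,
    '}',
  ].join('\n');
}

console.log(buildFind({first_name: 'Test', email: 'testuser@example.com'}));
```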

4.4.4 Update/Delete Data

// WIP......