TARQL extension for data cubes

 

Download (v.1.0) Get the latest version.

The user can provide the parameters that configure the transformation process, such as the TARQL query and the location of the source data. The module executes and manages the transformation process and writes the output triples to the Information Workbench SPARQL endpoint, so the resulting data can be queried together with data provided by other modules.

 

[Screenshot: tarql-screen]

To help the user convert data to an RDF data cube, the module provides a sample TARQL query that outputs observation descriptions according to the RDF Data Cube vocabulary. The query can be used as a template.
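The sample query itself is not reproduced here. As a rough illustration of the kind of output such a mapping produces, the following Python sketch turns each CSV row into one qb:Observation written as N-Triples. It is not the module's code: the `http://example.org/` namespace, the dataset URI, the dimension and measure properties, and the column names `area`, `year` and `population` are all assumptions made for this example.

```python
import csv
import io

QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
BASE = "http://example.org/"  # hypothetical namespace, not from the toolkit


def row_to_observation(row, rownum, dataset=BASE + "dataset/population"):
    """Return N-Triples lines describing one qb:Observation for a CSV row.

    Assumes the row has 'area', 'year' and 'population' columns and that
    the values are safe to embed in URIs and literals without escaping.
    """
    area, year, population = row["area"], row["year"], row["population"]
    obs = f"<{BASE}obs/{rownum}>"
    return [
        f"{obs} <{RDF_TYPE}> <{QB}Observation> .",
        f"{obs} <{QB}dataSet> <{dataset}> .",
        # Dimensions (hypothetical properties in the example namespace).
        f"{obs} <{BASE}dimension/refArea> <{BASE}area/{area}> .",
        f'{obs} <{BASE}dimension/refPeriod> "{year}" .',
        # Measure, emitted as a typed literal.
        f'{obs} <{BASE}measure/population> "{population}"^^<{XSD_DECIMAL}> .',
    ]


def csv_to_observations(text):
    """Convert CSV text (header row first) into a flat list of N-Triples lines."""
    triples = []
    for rownum, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        triples.extend(row_to_observation(row, rownum))
    return triples


# Made-up sample rows, for illustration only.
sample = "area,year,population\nA1,2011,1000\nB2,2011,2500\n"
for line in csv_to_observations(sample):
    print(line)
```

In a TARQL mapping the same structure would be expressed declaratively in the CONSTRUCT template, with the column headers acting as variables; the sketch only shows which triples one row is expected to yield.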

Role

This component converts legacy tabular data (such as CSV/TSV files) to RDF.

 

How it works

The TARQL extension for data cubes is a tool for converting CSV files to RDF using SPARQL 1.1 syntax. The data cubes are generated from the provided query templates, which are easy to modify. The extension is integrated with the Information Workbench as a standard data provider. Its interface lets the user specify basic information about the provider, the location of the CSV file and the polling interval, and modify the cube mapping query in the provided SPARQL editor. The interface also shows the status of the transformation and the time it took, and lets the user browse the generated output triples.
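The settings and the reported run information described above can be pictured with a small Python sketch. This is a hypothetical model, not the Information Workbench provider API: the class names `ProviderConfig` and `RunReport`, their fields, and the `run_once` helper are all invented for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProviderConfig:
    """Hypothetical stand-in for the settings entered in the provider form."""
    csv_location: str
    polling_interval_s: int
    mapping_query: str


@dataclass
class RunReport:
    """What one transformation run reports back: status, duration, output size."""
    status: str
    seconds: float
    triple_count: int


def run_once(config: ProviderConfig,
             transform: Callable[[ProviderConfig], List[str]]) -> RunReport:
    """Run one transformation and record its status, duration and triple count."""
    start = time.perf_counter()
    try:
        triples = transform(config)
        status = "ok"
    except Exception as exc:  # report the failure instead of stopping the poller
        triples = []
        status = f"failed: {exc}"
    return RunReport(status, time.perf_counter() - start, len(triples))


# Placeholder values; the transform here is a stub that returns one triple.
config = ProviderConfig("file:data.csv", 3600, "CONSTRUCT { } WHERE { }")
report = run_once(config, lambda cfg: ["<urn:s> <urn:p> <urn:o> ."])
print(report.status, report.triple_count)
```

A scheduler would call something like `run_once` every `polling_interval_s` seconds; that loop is left out to keep the sketch short.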

The TARQL extension can be stacked with other data components; for instance, the output RDF can be visualised with the OpenCube Map View.

 

Functionality

The component is built on top of Apache Jena ARQ. The OpenCube TARQL component includes the new release of TARQL, which brings several improvements: streaming capabilities, multiple query patterns in one mapping file, convenience functions for typical mapping activities, validation rules included in the mapping file, and increased flexibility in dealing with CSV variants such as TSV.

The OpenCube Toolkit user can create a cube directly from the input files and store the output in the SPARQL endpoint. The OpenCube TARQL extension offers the following options:

  • Streaming CSV processing
  • Use of column headers as variable names
  • Translate the imported CSV table into RDF using a prepared mapping file (a SPARQL CONSTRUCT query)
  • Test the mapping file (shows only the CONSTRUCT template, variable names, and a few input rows)
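The options listed above (streaming, headers as variable names, and a test mode that shows only a few input rows) can be sketched in Python as follows. This is an illustration of the ideas, not TARQL's implementation; in particular, the rule used in `to_variable` for turning a header into a variable name is an assumption of this sketch, and the function names are hypothetical.

```python
import csv
import re
from itertools import islice


def to_variable(header):
    """Turn a column header into a SPARQL-style variable name (sketch rule:
    runs of non-word characters become a single underscore)."""
    return "?" + re.sub(r"\W+", "_", header.strip())


def stream_bindings(lines):
    """Yield one {variable: value} dict per data row, reading the input lazily.

    `lines` can be any iterable of text lines, such as an open file, so the
    whole table never has to be held in memory.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    variables = [to_variable(h) for h in header]
    for row in reader:
        yield dict(zip(variables, row))


def preview(lines, limit=3):
    """Return the variable names and the first few rows, as a test mode might."""
    rows = list(islice(stream_bindings(lines), limit))
    variables = list(rows[0]) if rows else []
    return variables, rows


# Made-up sample input, for illustration only.
data = ["Town name,Population\n", "A,10\n", "B,20\n", "C,30\n", "D,40\n"]
variables, rows = preview(iter(data), limit=2)
print(variables)
print(rows)
```

Because `stream_bindings` is a generator, `preview` stops reading after `limit` rows, which is what makes a quick check of a mapping against a large file cheap.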