R statistical analysis module
This component lets users apply statistical analysis methods to data represented as RDF data cubes.
How it works
Although the component is implemented as an Information Workbench (IWB) app, it requires an instance of R to be exposed as a network service. This is achieved by using Rserve, which is installed as an R package and allows external connections to the R environment over TCP/IP. Rserve also provides a Java client API, which offers basic support for executing R scripts and manipulating basic R data types.
We extended this API to include the following necessary capabilities:
- Conversion of data types between the Sesame RDF API and the Rserve R API.
- Ability to operate on R data frame objects.
- Conversion of SPARQL SELECT query result sets into R data frame objects.
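The last capability can be pictured as a change of layout: a SELECT result is a sequence of rows (one set of bindings per solution), whereas an R data frame is a set of equally long column vectors. The following is only a minimal sketch of that reshaping, assuming string values throughout. The class and method names are hypothetical, and it deliberately uses plain Java collections rather than the Sesame or Rserve classes the module actually works with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not a class from the module.
class SelectToDataFrame {

    // Turns row-oriented bindings (one map per query solution) into
    // column-oriented vectors (one list per variable), which is the
    // layout of an R data frame. A variable that is unbound in a row
    // becomes null, playing the role of NA in R.
    static Map<String, List<String>> toColumns(List<String> variables,
                                               List<Map<String, String>> rows) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        for (String variable : variables) {
            columns.put(variable, new ArrayList<>());
        }
        for (Map<String, String> row : rows) {
            for (String variable : variables) {
                columns.get(variable).add(row.get(variable));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> variables = Arrays.asList("country", "population");

        Map<String, String> first = new LinkedHashMap<>();
        first.put("country", "DE");
        first.put("population", "83");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("country", "FR"); // population is unbound in this row

        System.out.println(toColumns(variables, Arrays.asList(first, second)));
    }
}
```

A real conversion would also have to map RDF datatypes (for example xsd:integer or xsd:double) to the matching R vector types, which this sketch leaves out.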
The IWB components are realised on top of this extended API. The results produced by an R script are used as follows:
- The charts produced in R are converted into PNG images and passed to IWB as binary objects. These are then shown on wiki pages as data URIs.
- The data frames produced as results of R scripts are saved into temporary H2 relational data tables. These data tables are then converted into RDF triples using R2RML mappings.
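To illustrate the chart path described above, a PNG image can be embedded in a page by Base64-encoding its bytes into a data URI. The snippet below is a minimal sketch using only the Java standard library. The class and method names are hypothetical, and the eight-byte PNG signature stands in for a real chart rendered by R.

```java
import java.util.Base64;

// Hypothetical illustration only; not a class from the module.
class PngDataUri {

    // Builds a data URI (RFC 2397) that a browser can render directly
    // as the src attribute of an <img> element.
    static String toDataUri(byte[] pngBytes) {
        return "data:image/png;base64,"
                + Base64.getEncoder().encodeToString(pngBytes);
    }

    public static void main(String[] args) {
        // The fixed signature that starts every PNG file, used here as
        // placeholder content instead of a real chart.
        byte[] placeholder = {(byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
        System.out.println("<img src=\"" + toDataUri(placeholder) + "\"/>");
    }
}
```

Embedding the image this way keeps the chart inside the page itself, so no separate image resource has to be stored or served.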
This component is developed as an extension of the Information Workbench platform to enable the system to perform various statistical analysis tasks. Given the rich capabilities of R as an environment for statistical analysis, it was decided to make these capabilities available for processing data retrieved from an RDF data store. In particular, with this component it is possible to retrieve relevant data from the repository, pass them to the R environment (converting data types to the corresponding R ones), execute R scripts that process these data, and use the results in the Information Workbench system. The component envisages three ways of utilising the results of R processing:
- Visualising a chart built in R as a widget.
- Visualising a data frame built in R in a table widget.
- Converting a data frame generated in R into a set of RDF triples to be stored in the RDF repository.
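The conversion of a data frame into triples can be pictured as follows: each row becomes a subject, each column a predicate, and each cell a literal object. The sketch below shows only that idea, under stated assumptions. It does not use H2 or an R2RML engine, the class and method names are hypothetical, the `http://example.org/` base IRI and the `row/` and `column/` path segments are invented for illustration, and column names are assumed to be safe to use inside an IRI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not a class from the module.
class DataFrameToTriples {

    // Emits one N-Triples line per non-missing cell of a column-oriented
    // data frame. Subjects are minted from the row number, loosely
    // mirroring what a template-based subject map does in R2RML.
    static List<String> toNTriples(Map<String, List<String>> columns, String baseIri) {
        List<String> triples = new ArrayList<>();
        int rowCount = columns.isEmpty()
                ? 0
                : columns.values().iterator().next().size();
        for (int row = 0; row < rowCount; row++) {
            String subject = "<" + baseIri + "row/" + (row + 1) + ">";
            for (Map.Entry<String, List<String>> column : columns.entrySet()) {
                String value = column.getValue().get(row);
                if (value == null) {
                    continue; // a missing (NA) cell produces no triple
                }
                String predicate = "<" + baseIri + "column/" + column.getKey() + ">";
                triples.add(subject + " " + predicate + " \"" + escape(value) + "\" .");
            }
        }
        return triples;
    }

    // Escapes the characters that may not appear raw in an N-Triples literal.
    static String escape(String value) {
        return value.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    public static void main(String[] args) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        columns.put("country", Arrays.asList("DE", "FR"));
        columns.put("population", Arrays.asList("83", null));
        for (String triple : toNTriples(columns, "http://example.org/")) {
            System.out.println(triple);
        }
    }
}
```

In practice the choice of subject IRIs, predicates, and literal datatypes would come from the mapping definition rather than being hard-coded as it is here.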
For installation instructions, see Installation.html inside the source distribution.
Get the latest version at: