Tiny RML

The package tinyrml is an implementation of a subset of RML and R2RML with some helpful extended features. It is intended to be used as a Python package/library, and accepts any Python iterable[dict] as input.

A short description of the Tiny RML API is given in the API section below.

Source code: gitlab.com/somanyaircraft/tinyrml/

PyPI package: pypi.org/project/tinyrml/

Comparison with RML and R2RML

Tiny RML only offers a subset of the functionality of RML and R2RML, but it also extends these languages in ways afforded by the Python implementation.

Limitations

Tiny RML has the following limitations:

Extensions

The package supports the following extensions to R2RML (note that a special namespace rre: is reserved for these extensions):

Tiny RML was originally part of rdfhelpers, but has since been split off as its own project. It does not depend on rdfhelpers.

Installation

Tiny RML can be installed from PyPI:

pip install tinyrml

API

In addition to the class Mapper (explained below), the tinyrml package exposes RR, RML, and RRE as the namespaces (instances of rdflib.Namespace) for R2RML, RML, and the Tiny RML extensions, respectively. By convention, we use the prefixes rr:, rml:, and rre: for these.

class Mapper

Tiny RML exposes the class Mapper which is the basic implementation of the mapping functionality. Instances of Mapper represent individual mappings (i.e., specific mapping definitions). The class constructor takes the following parameters:

def Mapper.process(self, rows, result_graph=)

Invokes a mapper. The parameter rows is an iterable of dicts used as the "rows" to be mapped; dictionary keys take the role of column names. If provided, result_graph= is a graph where results are added; otherwise a new graph is created. Regardless, the result graph is returned.
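
Because rows can be any iterable of dicts, data sources other than CSV files can be adapted with a small generator. The following is a minimal sketch using only the standard library; the helper name rows_from_query and the person table are hypothetical and not part of tinyrml.

```python
import sqlite3


def rows_from_query(connection, sql):
    """Yield one dict per result row, keyed by column name (hypothetical helper)."""
    cursor = connection.execute(sql)
    # The first element of each description entry is the column name.
    columns = [description[0] for description in cursor.description]
    for record in cursor:
        yield dict(zip(columns, record))


# Illustrative in-memory table; the schema is an assumption for this example.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE person (id INTEGER, name TEXT)")
connection.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

rows = list(rows_from_query(connection, "SELECT id, name FROM person ORDER BY id"))
print(rows)
```

Given a Mapper instance named mapper, the generator could then be passed directly, as in mapper.process(rows_from_query(connection, sql)).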

def Mapper.processCSVFile(self, source, result_graph=, skip_unicode_marker=)

Takes a CSV file (provided as the parameter source and passed to open) and maps its contents. The parameter result_graph is passed to process. If skip_unicode_marker is True (the default), the initial character in the source file is skipped (otherwise it becomes part of the name of the first column). The result graph is returned.
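
The effect of a leading Unicode marker on column names can be reproduced with the standard library alone. The sketch below is only an illustration: it uses Python's utf-8-sig codec to drop the marker, which is not necessarily how tinyrml implements skip_unicode_marker, and the helper read_rows is hypothetical.

```python
import csv
import io

# CSV bytes that start with a byte-order mark, as some spreadsheet exports do.
raw = "id,name\n1,Alice\n".encode("utf-8-sig")


def read_rows(data, encoding):
    """Decode CSV bytes with the given encoding and return the rows as dicts."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return list(csv.DictReader(text))


with_marker = read_rows(raw, "utf-8")         # marker stays in the first column name
without_marker = read_rows(raw, "utf-8-sig")  # marker is dropped while decoding

print(list(with_marker[0]))
print(list(without_marker[0]))
```

With the marker left in place, a template that refers to the first column by its plain name would not find it, which is why skipping the marker is the default.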

Template Formatting

Template strings (values of rr:template) do not support full JSONPath references. Paths like a.b.c are supported (see below); other features of JSONPath may be added in the future. The template mechanism is currently implemented using the string.Formatter class, so technically the format string syntax is available; this is likely to change in the future, though.
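
As a rough illustration of template expansion with string.Formatter, the sketch below fills a template from a row dict. The class FlatKeyFormatter is a hypothetical helper showing one way a dotted path could be looked up as a single key; it is an assumption for this example, not tinyrml's actual implementation.

```python
import string

# A plain field reference works with string.Formatter as-is.
row = {"id": 42, "name": "Alice"}
template = "http://example.org/person/{id}"
iri = string.Formatter().vformat(template, (), row)


class FlatKeyFormatter(string.Formatter):
    """Hypothetical helper: treat the whole field name as one dictionary key."""

    def get_field(self, field_name, args, kwargs):
        # By default "a.b" means attribute "b" of the value stored under "a";
        # here the dotted path is looked up verbatim instead.
        return kwargs[field_name], field_name


flat_row = {"a.b": 1, "c": 2}
dotted = FlatKeyFormatter().vformat("http://example.org/item/{a.b}/{c}", (), flat_row)

print(iri)
print(dotted)
```

Overriding get_field is needed here because the standard format string syntax would otherwise interpret the dot as attribute access.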

JSON object "flattening"

JSON objects, when processed, are first "flattened" into non-nested dicts. For example, the object

{ "a": {"b": 1}, "c": 2 }

becomes

{ "a.b": 1, "c": 2 }

after which the simple path "a.b" can be used in a template as a field reference.

@classmethod def Mapper.flatten(cls, data)

"Flattening" is done using this method. Subclasses of Mapper can override this if they so choose. The default implementation recursively flattens the data (a dict) and returns flattened data (also a dict).
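
The recursive flattening described above can be sketched as a standalone function. This is a minimal sketch of the idea, not the package's own code: the function name, the prefix parameter, and the choice to leave non-dict values (including lists) untouched are all assumptions.

```python
def flatten(data, prefix=""):
    """Return a non-nested dict whose keys are dotted paths into data."""
    flat = {}
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            # Assumption: anything that is not a dict is kept as a leaf value.
            flat[path] = value
    return flat


print(flatten({"a": {"b": 1}, "c": 2}))
```

A subclass of Mapper that overrides flatten could follow the same shape while handling lists or other structures differently.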

Recipes

If you have an RDF source file (say, mappings.ttl) with multiple mappings (i.e., triples maps), you can parse the file and create multiple Mapper instances. For example, assuming triples maps ex:tm_1 and ex:tm_2 (corresponding to EX.tm_1 and EX.tm_2), you could do this:

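import rdflib
import tinyrml

# Hypothetical namespace for the ex: prefix; use the one from your mappings file
EX = rdflib.Namespace("http://example.org/")

# Parse the mappings once, then create one Mapper per triples map
mappings = rdflib.Graph()
mappings.parse("mappings.ttl")
mapping_1 = tinyrml.Mapper(mappings, triples_map_uri=EX.tm_1)
mapping_2 = tinyrml.Mapper(mappings, triples_map_uri=EX.tm_2)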

To create an rdfhelpers.Composable instance by mapping some tabular data, you can do the following (assuming rdfhelpers is installed, mapper contains a Mapper instance, and rows contains the data to be mapped):

composable = rdfhelpers.Composable(mapper.process(rows))