Data Manifest is a Python package that implements our Data Manifest format specification.
Source code: gitlab.com/somanyaircraft/datamanifest/
PyPI package: pypi.org/project/datamanifest/
The package source code is available on GitLab (see the link above). The package is also published on PyPI and can be installed with
pip install datamanifest
It can also be installed directly from, say, PyCharm. After installation, you can use
import datamanifest
in your programs.
The SourceManager class implements a component that can gather information from data manifests. The constructor parameters are:
An rdflib.Graph instance, which is assumed to contain a SHACL shapes graph against which manifests are validated; if not provided, an internal shapes graph is used.
The readSources method reads the manifest at the parameter url (a URL). If the parameter try_default is True (the default) and the manifest file is not found, the provided default sources (see above) are used as a manifest instead.
The method returns an rdfhelpers.Composable instance containing the manifest graph.
This method is typically only called internally.
The collectSources method calls readSources to read the manifest graph; if the manifest contains references to other manifests, their contents are recursively traversed and included in the graph. It returns a list of SourceSpec instances describing all the sources found.
If the parameter validate is True (the default), validates each manifest using SHACL.
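The recursive traversal described above can be pictured with a small toy model. The sketch below is not the package's implementation: it replaces RDF manifests with plain dicts, and the names MANIFESTS, collect_sources, "sources" and "includes" are all hypothetical. The guard against visiting the same manifest twice is an assumption of the sketch; the text above does not say how the package handles circular references.

```python
# Toy model: each "manifest" is a plain dict keyed by its URL.
# This is hypothetical data standing in for real RDF manifest files.
MANIFESTS = {
    "https://example.org/a.ttl": {
        "sources": ["https://example.org/data1.ttl"],
        "includes": ["https://example.org/b.ttl"],
    },
    "https://example.org/b.ttl": {
        "sources": ["https://example.org/data2.ttl"],
        # Refers back to the first manifest, forming a cycle.
        "includes": ["https://example.org/a.ttl"],
    },
}


def collect_sources(url, manifests, seen=None):
    """Return all sources reachable from the manifest at url, depth-first."""
    if seen is None:
        seen = set()
    if url in seen:
        # Already visited: skip it so that cycles terminate.
        return []
    seen.add(url)
    manifest = manifests[url]
    found = list(manifest["sources"])
    for other in manifest["includes"]:
        found.extend(collect_sources(other, manifests, seen))
    return found


sources = collect_sources("https://example.org/a.ttl", MANIFESTS)
print(sources)
```

Each manifest contributes its own sources first, followed by those of the manifests it refers to.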
Collects all declared namespaces (from the manifest pointed to by the parameter url) and returns a dict mapping namespace prefixes to namespace URIs.
If the parameter namespaces is provided, it should be a dict; the namespaces discovered are added to it, and that dict is returned.
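The "add to the caller's dict and return it" behaviour can be illustrated in plain Python. This is only a sketch of the pattern, not the package's code: add_namespaces and its discovered argument are hypothetical names, and the sketch simply overwrites an existing prefix, whereas the text above does not say how the package treats clashes.

```python
def add_namespaces(discovered, namespaces=None):
    """Merge (prefix, uri) pairs into namespaces and return that dict."""
    if namespaces is None:
        # No dict supplied by the caller: start a fresh one.
        namespaces = {}
    for prefix, uri in discovered:
        namespaces[prefix] = uri
    return namespaces


existing = {"ex": "https://example.org/ns#"}
result = add_namespaces([("foaf", "http://xmlns.com/foaf/0.1/")], existing)

# The returned object is the caller's own dict, now with the new prefix added.
print(result is existing)
print(sorted(result))
```

Passing the same dict to several calls accumulates the prefixes from several manifests in one place.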
Collects and returns all the manifest file URLs as a list. The parameters are the same as for collectSources.
A typical way to use the functionality in this package is to instantiate SourceManager, call collectSources() with the URL of the manifest to be read, and iterate over the resulting list of SourceSpec instances. It is up to the caller to decide what to do with the information contained in these instances. OINK, for example, uses it to load or re-load sources, canonicalize namespace URIs, and declare namespace prefixes.
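The usage pattern just described might look roughly like the following. This is a sketch under assumptions: it supposes that SourceManager can be imported from the top-level datamanifest package and constructed without arguments, and that collectSources accepts the manifest URL as its first positional argument. The helper names list_sources and main are hypothetical, and the attributes of SourceSpec are not documented here, so the loop only prints each instance.

```python
def list_sources(manager, manifest_url):
    """Collect the SourceSpec instances for a manifest and print each one."""
    specs = manager.collectSources(manifest_url)
    for spec in specs:
        # What to do with each SourceSpec is up to the caller; here it is printed.
        print(spec)
    return specs


def main(manifest_url):
    # Assumption: SourceManager is exposed at the top level of the package
    # and all of its constructor parameters are optional.
    from datamanifest import SourceManager

    return list_sources(SourceManager(), manifest_url)
```

Calling main() with the URL of a manifest you control would then print one line per source found.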