Details
Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Description
RDF HDT is a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while maintaining search and browse operations without prior decompression. This makes it an ideal format for storing and sharing RDF datasets on the Web.
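To make the "search without prior decompression" idea concrete, here is a minimal sketch of the two core ingredients: a dictionary that stores each term once under an integer ID, and a sorted list of ID triples that can be searched directly. The class `TinyHdtSketch` and all of its methods are hypothetical names invented for illustration. The actual HDT format uses a more elaborate layout (a header, compressed dictionary sections, and a compact triples encoding) that this sketch does not attempt to reproduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified illustration; this is NOT the HDT binary layout.
class TinyHdtSketch {
    private final Map<String, Integer> termToId = new HashMap<>();
    private final List<String> idToTerm = new ArrayList<>();
    private final List<int[]> pending = new ArrayList<>();
    private int[][] triples = new int[0][];

    // Dictionary: each distinct term is stored once and gets a small integer ID.
    private int idOf(String term) {
        Integer id = termToId.get(term);
        if (id == null) {
            id = idToTerm.size();
            termToId.put(term, id);
            idToTerm.add(term);
        }
        return id;
    }

    void add(String subject, String predicate, String object) {
        pending.add(new int[] { idOf(subject), idOf(predicate), idOf(object) });
    }

    // Triples: keep only ID triples, sorted by (subject, predicate, object).
    void freeze() {
        triples = pending.toArray(new int[0][]);
        Arrays.sort(triples, (a, b) -> {
            if (a[0] != b[0]) return Integer.compare(a[0], b[0]);
            if (a[1] != b[1]) return Integer.compare(a[1], b[1]);
            return Integer.compare(a[2], b[2]);
        });
    }

    // Search runs on the integer IDs; only matching objects are decoded to strings.
    List<String> objectsOf(String subject, String predicate) {
        List<String> result = new ArrayList<>();
        Integer sId = termToId.get(subject);
        Integer pId = termToId.get(predicate);
        if (sId == null || pId == null) {
            return result;
        }
        int s = sId.intValue();
        int p = pId.intValue();
        // Binary search for the first triple whose (subject, predicate) is >= (s, p).
        int lo = 0;
        int hi = triples.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int[] t = triples[mid];
            if (t[0] < s || (t[0] == s && t[1] < p)) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        for (int i = lo; i < triples.length && triples[i][0] == s && triples[i][1] == p; i++) {
            result.add(idToTerm.get(triples[i][2]));
        }
        return result;
    }
}
```

After adding a few triples and calling `freeze()`, a call such as `objectsOf("ex:alice", "foaf:knows")` answers the query by binary search over the ID triples, without expanding the whole dataset back into strings.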
Currently, the existing Java implementation only provides bindings for Jena RIOT, and its license does not allow it to be integrated into the main Sesame codebase or any Apache codebase.
The idea is to write an Apache-licensed implementation of RDF HDT from scratch that supports the Sesame Rio infrastructure (RDFParser/RDFWriter/RDFHandler).
The implementation requires good knowledge of Java programming, plus a basic understanding of parser concepts and of the RDF and HDT data models.
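As a rough illustration of the streaming shape such a parser could take, the sketch below decodes ID triples through a dictionary and pushes each statement to a handler. It is deliberately self-contained: `TripleHandlerSketch` is a hypothetical stand-in that only mirrors the callback style of Sesame's RDFHandler (`startRDF`, `handleStatement`, `endRDF`), and `HdtParserSketch` reads a toy in-memory encoding rather than the HDT binary format. A real implementation would implement Sesame's RDFParser and emit Statement objects instead of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the callback style of Sesame's RDFHandler.
// The real interface receives Statement objects, not plain strings.
interface TripleHandlerSketch {
    void startRDF();
    void handleStatement(String subject, String predicate, String object);
    void endRDF();
}

// Hypothetical parser over a toy in-memory encoding (assumption: a term
// dictionary as a String[] and triples as arrays of dictionary indexes).
class HdtParserSketch {
    private final TripleHandlerSketch handler;

    HdtParserSketch(TripleHandlerSketch handler) {
        this.handler = handler;
    }

    // Decodes one triple at a time, so the full dataset is never materialized.
    void parse(String[] dictionary, int[][] idTriples) {
        handler.startRDF();
        for (int[] t : idTriples) {
            handler.handleStatement(dictionary[t[0]], dictionary[t[1]], dictionary[t[2]]);
        }
        handler.endRDF();
    }
}

// Example handler that collects each statement as an N-Triples-like line.
class CollectingHandlerSketch implements TripleHandlerSketch {
    final List<String> lines = new ArrayList<>();
    boolean finished = false;

    @Override
    public void startRDF() {
        lines.clear();
        finished = false;
    }

    @Override
    public void handleStatement(String subject, String predicate, String object) {
        lines.add(subject + " " + predicate + " " + object + " .");
    }

    @Override
    public void endRDF() {
        finished = true;
    }
}
```

Keeping the parser behind a handler interface means the same decoding loop could feed a writer for another syntax or a repository loader, which is the role the RDFParser/RDFHandler pairing plays in Rio.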