Description
Rya is a scalable RDF triple store that supports SPARQL. Consider a scenario in which multiple Rya instances need to collaborate. Each instance stores its own data, but we want to run a global search (a SPARQL query) over the union of the data held by all instances. Implement the operations that each instance needs in order to support such a global search, and implement a basic federation coordinator that produces the final answer to the global query.
Sub-tasks that need to be implemented:
-Implement a federation coordinator. The coordinator will have to:
1. parse the SPARQL query; OpenRDF provides utilities for that, and Rya already has this implemented, so it should be possible to re-use existing code
2. generate a federated query execution plan that includes source selection (deciding where each triple pattern or sub-graph should be evaluated). OpenRDF has utilities to generate a query plan for a SPARQL query (which Rya currently uses), and might have utilities to generate a federated query plan. For source selection in the initial implementation, we can assume either that a global catalog exists, or that SPARQL ASK queries can be sent to the federation members to decide the source for each triple pattern in the SPARQL query.
3. query plan optimization: this part is optional for the initial implementation. If time permits, cardinality-based optimizations should be considered, as well as alternative query plan shapes (for example, bushy plans vs. left-deep plans).
4. query plan execution: implement at least one join algorithm to combine the results received from the federated sources. Rya already implements join algorithms, so that code might be re-used.
-Implement a mediator layer for Rya to mediate between the coordinator and the local Rya instance. This layer should be able to receive a SPARQL query or a triple pattern, possibly with bindings, and return an answer. As Rya already provides this functionality, the mediator layer might only need to expose it and communicate with the federation coordinator.
-Implement federated inference: this part is likely outside the scope of this project and requires additional research.
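
As a rough illustration of how the coordinator sub-tasks above could fit together, here is a minimal in-memory sketch in Python. It is not Rya or OpenRDF code: the `Mediator` class, `select_sources`, `hash_join`, and `federated_query` are hypothetical names, triple patterns are plain tuples in which terms starting with `?` are variables, and the ASK probe is emulated locally rather than sent as a real SPARQL ASK query. A hash join is used only as one example of a join algorithm.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str, str]   # terms starting with "?" are variables
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    return term.startswith("?")


class Mediator:
    """Hypothetical stand-in for the mediator layer of one Rya instance."""

    def __init__(self, name: str, triples: Iterable[Triple]):
        self.name = name
        self.triples = list(triples)

    def evaluate(self, pattern: Pattern,
                 binding: Optional[Binding] = None) -> List[Binding]:
        """Return one solution per local triple that matches the pattern."""
        solutions = []
        for triple in self.triples:
            solution = dict(binding or {})
            for term, value in zip(pattern, triple):
                if is_var(term):
                    # Bind the variable, or reject on a conflicting value.
                    if solution.setdefault(term, value) != value:
                        break
                elif term != value:
                    break
            else:
                solutions.append(solution)
        return solutions

    def ask(self, pattern: Pattern) -> bool:
        """Emulates a SPARQL ASK: does this instance have any match?"""
        return bool(self.evaluate(pattern))


def select_sources(pattern: Pattern, members: List[Mediator]) -> List[Mediator]:
    """ASK-based source selection: keep members that can answer the pattern."""
    return [m for m in members if m.ask(pattern)]


def hash_join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Join two solution lists on their shared variables.

    Assumes every row on one side binds the same set of variables.
    """
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    index = defaultdict(list)
    for row in left:                      # build phase
        index[tuple(row[v] for v in shared)].append(row)
    joined = []
    for row in right:                     # probe phase
        for match in index.get(tuple(row[v] for v in shared), []):
            joined.append({**match, **row})
    return joined


def federated_query(patterns: List[Pattern],
                    members: List[Mediator]) -> List[Binding]:
    """Left-deep plan: evaluate patterns in order and join as we go."""
    result = None
    for pattern in patterns:
        rows: List[Binding] = []
        for member in select_sources(pattern, members):
            rows.extend(member.evaluate(pattern))
        result = rows if result is None else hash_join(result, rows)
    return result or []


# Two made-up federation members, each holding part of the data.
members = [
    Mediator("A", [("alice", "knows", "bob")]),
    Mediator("B", [("bob", "worksAt", "acme"), ("carol", "worksAt", "initech")]),
]
query = [("?x", "knows", "?y"), ("?y", "worksAt", "?z")]
answers = federated_query(query, members)
```

In this sketch the union of results from several members is not de-duplicated and patterns are joined in the order given, so a real coordinator would still need the plan optimization and execution work described in the sub-tasks.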