Details
Type: Task
Status: Closed
Priority: Major
Resolution: Not A Problem
Description
Create a sis-tika module as a bridge between Apache SIS and Apache Tika. Tika is a general-purpose content detection and analysis library. One of its key components is a generic, typed metadata container. Tika can run ahead of Lucene to represent data from various sources (PDF, Office, TIFF, etc.) in a uniform way that Lucene can index, but it does not depend on Lucene.
The most obvious SIS parts that can be linked to Tika are the org.apache.sis.metadata.iso packages. This SIS metadata module specifically addresses ISO 19115 metadata, while Tika is more generic. Consequently, we should be able to map all SIS metadata to Tika, but the converse may not always be possible.
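To make the SIS-to-Tika direction concrete, here is a minimal sketch of how a hierarchical metadata tree could be flattened into a generic container of named, multi-valued string properties. It uses neither the Apache SIS nor the Apache Tika API: the class SisTikaSketch, the method flatten, the dotted key names, and the use of nested maps as a stand-in for ISO 19115 objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; it does not use the Apache SIS or Apache Tika APIs. */
final class SisTikaSketch {
    private SisTikaSketch() {}

    /**
     * Flattens a nested metadata tree into dotted-key properties.
     * Nested maps stand in for ISO 19115 objects, and lists stand in
     * for multi-valued properties. Both are assumptions of this sketch.
     */
    static Map<String, List<String>> flatten(Map<String, ?> tree) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        flattenInto("", tree, out);
        return out;
    }

    private static void flattenInto(String prefix, Object node,
                                    Map<String, List<String>> out) {
        if (node instanceof Map) {
            // Descend into a child object, extending the dotted key.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = prefix.isEmpty()
                        ? String.valueOf(e.getKey())
                        : prefix + "." + e.getKey();
                flattenInto(key, e.getValue(), out);
            }
        } else if (node instanceof Iterable) {
            // A multi-valued property: every element is stored under the same key.
            for (Object item : (Iterable<?>) node) {
                flattenInto(prefix, item, out);
            }
        } else if (node != null) {
            // A leaf value: store its string form; null values are skipped.
            out.computeIfAbsent(prefix, k -> new ArrayList<>()).add(node.toString());
        }
    }

    public static void main(String[] args) {
        Map<String, Object> citation = new LinkedHashMap<>();
        citation.put("title", "Sea surface temperature");

        Map<String, Object> identification = new LinkedHashMap<>();
        identification.put("citation", citation);
        identification.put("topicCategory", Arrays.asList("oceans", "environment"));

        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("identificationInfo", identification);
        metadata.put("language", "en");

        System.out.println(flatten(metadata));
    }
}
```

In this sketch the forward direction always succeeds, because any tree can be written out as named string values. Going the other way is harder: a generic container may hold properties that have no place in the ISO 19115 model.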