Daffodil / DAFFODIL-1710

Apache Tika integration


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: API, Integrations
    • Labels: None

    Description

      Daffodil's parser could be wrapped behind the Apache Tika APIs, so that any DFDL-described format could be mined for text content the same way Tika handles other formats.

      This would probably need to be schema-aware: Tika events should be reported only for text content, not for numeric content.
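
The schema-aware filtering described above can be illustrated with a small, language-neutral sketch. It uses neither the Daffodil nor the Tika API; the `Leaf` class, the `TEXT_TYPES` set, the `text_events` function, and the sample data are all hypothetical stand-ins. The only idea it shows is that each parsed leaf element carries the simple type declared in its DFDL schema, and only leaves with a textual type are passed on as text events.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """Hypothetical stand-in for one leaf element of a parsed infoset."""
    name: str
    xsd_type: str  # simple type declared in the schema, e.g. "xs:string"
    value: str


# Assumption: only string-typed elements count as "text content".
TEXT_TYPES = {"xs:string"}


def text_events(leaves):
    """Yield (element name, text) only for leaves whose schema type is textual."""
    for leaf in leaves:
        if leaf.xsd_type in TEXT_TYPES:
            yield leaf.name, leaf.value


# Illustrative data only; a real integration would take these from the parser.
infoset = [
    Leaf("title", "xs:string", "Quarterly report"),
    Leaf("pageCount", "xs:int", "42"),
    Leaf("author", "xs:string", "A. Writer"),
    Leaf("checksum", "xs:hexBinary", "DEADBEEF"),
]

extracted = list(text_events(infoset))
```

In an actual integration the filtered events would presumably be forwarded to the SAX `ContentHandler` that Tika passes to its parsers, and the type information would come from the compiled DFDL schema rather than from a hand-built list.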


          People

            Assignee: Unassigned
            Reporter: Mike Beckerle (mbeckerle)
            Votes: 0
            Watchers: 1
