Tika / TIKA-3716

Add metadata element for all parsers that processed a file

Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4.0
    • Component/s: None
    • Labels: None

    Description

      We currently have a "parsed by" data element in the metadata, but it only covers the initial container file. It would be useful to record every parser that touched a file and its embedded files. This information is already recorded by the RecursiveParserWrapper (the /rmeta endpoint and the -J option), but it would also be useful for legacy endpoints such as /tika.

      We recognize that this information will be added to the container file's metadata only after the full parse, so it will not appear in the XHTML markup because of the way the XHTMLHandler works. However, it will appear in the JSON output of the /tika endpoint and will be available to those calling Tika programmatically.
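
The idea can be illustrated with a minimal sketch that uses plain Java collections rather than the Tika API. The `Doc` class, the `all-parsers` and `parsed-by` keys, and the parser names are hypothetical stand-ins chosen for illustration; they are not taken from the Tika code base. The sketch only shows why the full set of parsers is known after the whole tree of embedded files has been walked, which is why it can reach the container metadata but not markup that was already streamed out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ParsedByFullSetSketch {

    // Hypothetical stand-in for a parsed file: the parser that handled it
    // plus the files embedded inside it.
    static final class Doc {
        final String parserName;
        final List<Doc> embedded;

        Doc(String parserName, List<Doc> embedded) {
            this.parserName = parserName;
            this.embedded = embedded;
        }
    }

    // Hypothetical metadata key, used here for illustration only.
    static final String ALL_PARSERS_KEY = "all-parsers";

    // Depth-first walk that records each parser once, in first-seen order.
    static void collect(Doc doc, Set<String> seen) {
        seen.add(doc.parserName);
        for (Doc child : doc.embedded) {
            collect(child, seen);
        }
    }

    // Builds the container's metadata. The full parser set is written last,
    // after every embedded file has been visited.
    static Map<String, List<String>> parse(Doc container) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        // Existing behaviour: only the container's own parser is recorded.
        metadata.put("parsed-by", List.of(container.parserName));

        Set<String> seen = new LinkedHashSet<>();
        collect(container, seen);
        metadata.put(ALL_PARSERS_KEY, new ArrayList<>(seen));
        return metadata;
    }

    public static void main(String[] args) {
        Doc image = new Doc("ImageParser", List.of());
        Doc pdf = new Doc("PDFParser", List.of(image));
        Doc zip = new Doc("ZipParser", List.of(pdf, image));
        System.out.println(parse(zip));
    }
}
```

A set is used so that a parser that handles several embedded files (here the hypothetical `ImageParser`) is listed only once.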

          People

            Assignee: Unassigned
            Reporter: Tim Allison (tallison)