  1. Tika
  2. TIKA-652

Custom metadata from more formats

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9
    • Fix Version/s: 0.10
    • Component/s: parser
    • Labels: None

    Description

      Currently, Tika extracts custom metadata from OpenDocument files. Each custom property is returned with a custom: prefix (see OpenOfficeParserTest#testOO2Metadata for an example).

      The parsers for Microsoft file formats don't include custom metadata in their output, and neither does the PDF parser.

      Assuming we're happy with including custom metadata from documents in the parsing step, with the custom: prefix, I'll go ahead and add it to the Microsoft (OLE2 and OOXML) and PDF parsers.
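
      To illustrate the naming convention described above, here is a minimal sketch of how user-defined document properties could be mapped to keys carrying the custom: prefix. It uses only the Java standard library and is not Tika's parser code: the class name CustomMetadataSketch, the constant CUSTOM_PREFIX, the method withCustomPrefix and the sample property names are all hypothetical, chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: this is NOT Tika's implementation. All names are hypothetical.
final class CustomMetadataSketch {
    // Prefix that marks a metadata key as a user-defined (custom) property.
    static final String CUSTOM_PREFIX = "custom:";

    // Returns a new map in which every user-defined property key carries the prefix.
    // Insertion order is preserved; the input map is left unmodified.
    static Map<String, String> withCustomPrefix(Map<String, String> userDefined) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : userDefined.entrySet()) {
            out.put(CUSTOM_PREFIX + e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample custom properties, as a document author might have defined them.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("Reviewer", "alice");
        props.put("Department", "docs");
        System.out.println(withCustomPrefix(props));
    }
}
```

      Keeping the prefix in a single constant means a consumer can separate custom properties from standard ones with a plain startsWith check, whichever parser produced them.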

            People

              Assignee: Nick Burch
              Reporter: Nick Burch