Tika / TIKA-928

Separation of Tika Core Properties From Metadata Processing

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1
    • Fix Version/s: 2.0.0-BETA, 2.1.0
    • Component/s: metadata
    • Labels: None

    Description

      The Metadata class is overloaded: it defines both the metadata processing logic and the core Tika properties in the same place.

      Separating the core properties into a TikaCoreProperties class, which contains only composite properties that reference other standards such as DublinCore, will allow the Metadata class to focus on processing. It will also ease the transition away from the now-deprecated String properties that were included directly in Metadata via its implements clause.

      This will also allow us to cherry-pick only the properties we want from a standard as Tika core properties, rather than having to include every property in that standard's interface, some of which may be more specific to a particular content type than we want.
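
      The proposed split can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: the classes below are simplified stand-ins that only borrow the names used in this issue, and the legacy key names ("title", "Author"), the selection of Dublin Core properties, and all method signatures are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only. These are NOT the real
// org.apache.tika.metadata classes; names and signatures are assumptions.
final class CorePropertiesSketch {

    /** A named property; a composite also mirrors values to secondary keys. */
    static final class Property {
        final String name;
        final List<Property> secondaries;

        private Property(String name, List<Property> secondaries) {
            this.name = name;
            this.secondaries = secondaries;
        }

        static Property simple(String name) {
            return new Property(name, List.of());
        }

        /** Composite: uses the primary's name and carries the secondaries along. */
        static Property composite(Property primary, Property... secondaries) {
            return new Property(primary.name, List.of(secondaries));
        }
    }

    /** The full interface of a standard (hypothetical subset of Dublin Core). */
    static final class DublinCore {
        static final Property TITLE = Property.simple("dc:title");
        static final Property CREATOR = Property.simple("dc:creator");
        // Present in the standard, but deliberately not picked as a core property.
        static final Property COVERAGE = Property.simple("dc:coverage");
    }

    /** Only the cherry-picked properties are exposed as core properties. */
    static final class TikaCoreProperties {
        // Secondary keys stand in for the deprecated String properties (assumed names).
        static final Property TITLE =
                Property.composite(DublinCore.TITLE, Property.simple("title"));
        static final Property CREATOR =
                Property.composite(DublinCore.CREATOR, Property.simple("Author"));
    }

    /** Metadata handles processing only; it defines no property constants. */
    static final class Metadata {
        private final Map<String, String> values = new HashMap<>();

        /** Stores the value under the property's name and under each secondary. */
        void set(Property property, String value) {
            values.put(property.name, value);
            for (Property secondary : property.secondaries) {
                set(secondary, value);
            }
        }

        String get(String name) {
            return values.get(name);
        }
    }

    public static void main(String[] args) {
        Metadata metadata = new Metadata();
        metadata.set(TikaCoreProperties.TITLE, "Annual Report");
        System.out.println(metadata.get("dc:title")); // prints "Annual Report"
        System.out.println(metadata.get("title"));    // prints "Annual Report"
    }
}
```

      In this sketch, callers that still read the old String keys keep working while new code depends only on TikaCoreProperties, and DublinCore.COVERAGE stays available in the standard's own class without becoming a core property.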

      Attachments

        1. tika-core-properties-metadata-refactor-parsers.diff
          131 kB
          Ray Gauss II
        2. tika-core-properties-metadata-refactor-core.diff
          6 kB
          Ray Gauss II
        3. tika-core-properties.diff
          4 kB
          Ray Gauss II

            People

              Assignee: Unassigned
              Reporter: Ray Gauss II (rgauss)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated: