Tika / TIKA-1887

Add new MIME type for the .po file extension

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core, mime
    • Labels:
    • Flags:
      Patch

      Description

      Hi,

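      While analyzing the TREC DD Polar data, we came across files that Tika classified as application/octet-stream.
      Using content-based algorithms such as Byte Frequency Analysis (BFA), Byte Frequency Cross-Correlation (BFCC), and File Header/Trailer (FHT) analysis, we were able to determine additional magic bytes for some of these files.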
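
      The byte-frequency idea behind BFA can be sketched as follows. This is only an illustration, not the code used for the analysis above; the function name `byte_frequency` and the choice to normalize by the most frequent byte are assumptions.

```python
from collections import Counter


def byte_frequency(data: bytes) -> list[float]:
    """Return a 256-bin byte histogram scaled so the most common byte is 1.0.

    Hypothetical helper for illustration; the normalization is an assumption.
    """
    counts = Counter(data)                    # maps byte value (0-255) to count
    peak = max(counts.values(), default=0)    # 0 when the input is empty
    if peak == 0:
        return [0.0] * 256
    return [counts.get(value, 0) / peak for value in range(256)]


# Small PO-like sample: the double quote is the most frequent byte here.
sample = b'msgid ""\nmsgstr ""\n'
fingerprint = byte_frequency(sample)
```

      Fingerprints of many files of one type can then be averaged and compared against an unknown file to decide whether it resembles that type.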

      Programmers and translators use the GNU gettext toolset to produce, update, and use translation files, mainly PO files, which are editable text files.
      We suggest adding a new MIME type, text/po, to Tika's existing MIME repository.
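
      As a rough illustration of what content-based detection of PO files might look for, here is a minimal sketch. It is not Tika's detection logic. It assumes that a PO file is text containing the gettext keywords `msgid` and `msgstr` at the start of a line; the function name `looks_like_po` and the 8192-byte window are hypothetical choices.

```python
import re

# gettext entries pair a `msgid "..."` line with a `msgstr "..."` line;
# plural entries use `msgstr[0] "..."`, `msgstr[1] "..."`, and so on.
_MSGID = re.compile(rb'^msgid\s+"', re.MULTILINE)
_MSGSTR = re.compile(rb'^msgstr(\[\d+\])?\s+"', re.MULTILINE)


def looks_like_po(data: bytes, window: int = 8192) -> bool:
    """Heuristically decide whether the leading bytes resemble a PO file."""
    head = data[:window]
    if b"\x00" in head:          # NUL bytes suggest binary content, not text
        return False
    return bool(_MSGID.search(head)) and bool(_MSGSTR.search(head))


po_sample = b'# Translator comment\nmsgid ""\nmsgstr ""\n"Language: fr\\n"\n'
is_po = looks_like_po(po_sample)
```

      A check like this could complement the .po extension glob for files whose names carry no extension.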

            People

            • Assignee: Unassigned
            • Reporter: Manali Shah (manalishah.91@gmail.com)
            • Votes: 0
            • Watchers: 2

                Time Tracking

                • Original Estimate: 24h
                • Remaining Estimate: 24h
                • Time Spent: Not Specified