  1. Tika
  2. TIKA-1120

Enable direct use of org.apache.tika.mime.MediaType.detect(...)

    Details

    • Type: Wish
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 1.3
    • Fix Version/s: None
    • Component/s: mime
    • Labels:
      None

      Description

      For MIME type detection, the classes allow the following usage:

      try (InputStream is = theInputStream;
           BufferedInputStream bis = new BufferedInputStream(is)) {
          // MimeTypes is instantiated directly here.
          MimeTypes mt = new MimeTypes();
          Metadata md = new Metadata();
          md.add(Metadata.RESOURCE_NAME_KEY, theFileName);
          // Pass the metadata so the file name hint is available to the detector.
          MediaType mediaType = mt.detect(bis, md);
          return mediaType.toString();
      }

      Debugging this shows that new MimeTypes() initializes its internal patterns with an empty MediaTypeRegistry. As a result, getDefaultMimeTypes() is never called and tika-mimetypes.xml is never read, so no type definitions are available for detection.
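
      The effect can be illustrated with a small, self-contained sketch. It does not use Tika at all: ToyMimeTypes, its two magic prefixes, and the fallback constant are hypothetical stand-ins chosen for illustration, and the real classes are more involved. The only point is that a detector built by the plain constructor has nothing to match against, while one built by a factory that loads the definitions does.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model; none of these names are Tika's real classes.
final class ToyMimeTypes {
    static final String OCTET_STREAM = "application/octet-stream";

    // Magic prefix -> media type; stays empty unless definitions are loaded.
    private final Map<String, String> magics = new LinkedHashMap<>();

    // Mirrors the reported pitfall: the plain constructor loads nothing.
    ToyMimeTypes() { }

    // Stands in for a factory that reads the bundled type definitions.
    static ToyMimeTypes getDefaultMimeTypes() {
        ToyMimeTypes types = new ToyMimeTypes();
        types.magics.put("%PDF-", "application/pdf");
        types.magics.put("<?xml", "application/xml");
        return types;
    }

    // Returns the first type whose magic prefix matches, else the fallback.
    String detect(byte[] head) {
        String text = new String(head, StandardCharsets.ISO_8859_1);
        for (Map.Entry<String, String> e : magics.entrySet()) {
            if (text.startsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return OCTET_STREAM;
    }

    public static void main(String[] args) {
        byte[] pdf = "%PDF-1.4".getBytes(StandardCharsets.ISO_8859_1);
        // Directly constructed: nothing to match, prints the fallback type.
        System.out.println(new ToyMimeTypes().detect(pdf));
        // Built via the factory: the PDF prefix is known, prints application/pdf.
        System.out.println(ToyMimeTypes.getDefaultMimeTypes().detect(pdf));
    }
}
```

      In this toy model the directly constructed instance always falls back to application/octet-stream, which matches the symptom described above.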

      Is it possible to enable direct usage of MimeTypes.detect()? For example, by adding a new constructor that accepts the MediaTypeRegistry to use?

      If not, the code comments (or the documentation at https://tika.apache.org/0.10/detection.html) should point out that MimeTypes should not be instantiated directly for MIME type detection and that the detectors should be used instead. A minimal example could be added to make the usage clear.

      The following example works here:

      try (InputStream is = theInputStream;
           BufferedInputStream bis = new BufferedInputStream(is)) {
          // Obtain the detector from the parser instead of instantiating MimeTypes.
          AutoDetectParser parser = new AutoDetectParser();
          Detector detector = parser.getDetector();
          Metadata md = new Metadata();
          md.add(Metadata.RESOURCE_NAME_KEY, theFileName);
          MediaType mediaType = detector.detect(bis, md);
          return mediaType.toString();
      }

            People

            • Assignee: Unassigned
            • Reporter: Oliver Kopp (koppor)