  1. Tika
  2. TIKA-2518

tika app outputs warnings by default

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.16
    • Fix Version/s: None
    • Component/s: app
    • Labels: None

    Description

      After downloading the latest Tika (1.16) and running basic commands, tika-app prints unwanted warnings alongside the requested output, so any caller has to parse the output to strip them out.

      Example 1:

      java -jar tika-app-1.16.jar --list-detectors
      Dec 05, 2017 3:16:13 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
      WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
      See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
      for optional dependencies.
      TIFFImageWriter not loaded. tiff files will not be processed
      See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
      for optional dependencies.
      J2KImageReader not loaded. JPEG2000 files will not be processed.
      See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
      for optional dependencies.
      
      Dec 05, 2017 3:16:13 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
      WARNING: org.xerial's sqlite-jdbc is not loaded.
      Please provide the jar on your classpath to parse sqlite files.
      See tika-parsers/pom.xml for the correct version.
      org.apache.tika.detect.DefaultDetector (Composite Detector):
        org.apache.tika.parser.microsoft.POIFSContainerDetector
        org.apache.tika.parser.pkg.ZipContainerDetector
        org.gagravarr.tika.OggDetector
        org.apache.tika.mime.MimeTypes
      

      Example 2:

      java -jar tika-app-1.16.jar --text my.xlsx
      Dec 05, 2017 3:00:22 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
      WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
      See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
      for optional dependencies.
      TIFFImageWriter not loaded. tiff files will not be processed
      See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
      for optional dependencies.
      J2KImageReader not loaded. JPEG2000 files will not be processed.
      See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
      for optional dependencies.
      
      Dec 05, 2017 3:00:22 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
      WARNING: org.xerial's sqlite-jdbc is not loaded.
      Please provide the jar on your classpath to parse sqlite files.
      See tika-parsers/pom.xml for the correct version.
      INFO  As a convenience, TikaCLI has turned on extraction of
      inline images for the PDFParser (TIKA-2374).
      This is not the default option in Tika generally or in tika-server.
      As a convenience, TikaCLI has turned on extraction of
      inline images for the PDFParser (TIKA-2374).
      This is not the default option in Tika generally or in tika-server.
      

      The expected behavior is to return only the requested information. I do not see a switch to turn off or control unrequested warnings.

      I can't imagine this is the intended behavior. It is not documented, and I could not find any explanation of why these messages are printed.
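
      Until such a switch exists, one possible workaround is for the calling program to keep only standard output. The sketch below assumes the warnings are written to standard error and the requested data to standard output. That is the default for the java.util.logging console handler, whose format the WARNING lines above match, but it has not been confirmed for every message tika-app prints. The helper name `run_stdout_only` and the stand-in child command are hypothetical, used only so the example runs without the jar.

```python
import subprocess
import sys


def run_stdout_only(cmd):
    """Run cmd and return its standard output as text, discarding standard error."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,      # capture the requested output
        stderr=subprocess.DEVNULL,   # drop anything written to stderr
        text=True,
        check=True,                  # raise if the command exits non-zero
    )
    return result.stdout


# Stand-in for tika-app (an assumption for illustration): one warning on
# stderr, one line of requested output on stdout.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; print('WARNING: noise', file=sys.stderr); print('requested output')",
]
demo_output = run_stdout_only(demo_cmd)
print(demo_output, end="")
```

      With the real jar, the equivalent call would be `run_stdout_only(["java", "-jar", "tika-app-1.16.jar", "--list-detectors"])`. If any of the messages turn out to be written to standard output instead, this approach will not remove them.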

          People

            Assignee: Unassigned
            Reporter: Ryan Brueske (rbrueske)