TIKA-1096: CompressorParser: Add support for handling concatenated InputStreams

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Component/s: parser
    • Labels: None

    Description

      COMPRESS-220 added support for CompressorStreamFactory to return an InputStream with decompressConcatenated set to true. Tika currently uses CompressorStreamFactory without this option, which caused problems for me when parsing gzipped files that require it.

      Today I have to do some pre-processing on the InputStreams before I send them to Tika; it would be great if Tika could handle this for me.

      I wrote up a quick patch that adds this option; I'll attach it soon.
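
      For readers unfamiliar with the problem, here is a minimal sketch of what a concatenated gzip stream looks like. It uses only java.util.zip from the JDK, not Commons Compress or Tika, and it is not the attached patch; the class and method names (ConcatenatedGzipDemo, gzip, concat, gunzipAll) are made up for illustration. Two independently compressed members are appended back to back, and a reader has to keep going past the end of the first member to recover all of the data. A reader that stops after the first member, which is what happens when decompressConcatenated is not set, would return only the first part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of Tika or Commons Compress.
class ConcatenatedGzipDemo {

    // Compress one string into a single, complete gzip member.
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Append two byte arrays, as `cat a.gz b.gz > both.gz` would.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] joined = new byte[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    // Decompress every gzip member in the input, not just the first one.
    // java.util.zip.GZIPInputStream continues into following members itself.
    static String gunzipAll(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] both = concat(gzip("first member\n"), gzip("second member\n"));
        // Prints the contents of both members, one per line.
        System.out.print(gunzipAll(both));
    }
}
```

      In Commons Compress, the corresponding behaviour is controlled by the decompressConcatenated flag mentioned above; how Tika should expose it is what this issue is about.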

      Attachments

        1. TIKA-1096.patch
          3 kB
          Gregory Chanan

          People

            Assignee: Chris A. Mattmann (chrismattmann)
            Reporter: Gregory Chanan (gchanan)