Tika / TIKA-388

Don't trust streams that claim mark support

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7
    • Component/s: parser
    • Labels: None

    Description

      As seen on tika-dev@ and in JCR-2576, some InputStream implementations claim to support the mark feature but lose the mark as soon as the end of the stream has been reached. A client has no way to detect such behaviour, so it's probably best for Tika to always wrap incoming streams in a BufferedInputStream when mark support is needed. This may add one extra layer of buffering, but it avoids problems with such broken streams.
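
      The approach described above can be sketched roughly as follows. This is only an illustration, not the actual Tika change: the class MarkSafeStreams, the helper ensureReliableMark, the ForgetfulStream test double and countThenReset are all hypothetical names invented for this example. ForgetfulStream imitates the kind of stream reported here (it returns true from markSupported() but forgets the mark at end of stream), and the helper wraps unconditionally instead of consulting markSupported().

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class MarkSafeStreams {

    /**
     * Hypothetical helper: always wrap the stream, deliberately ignoring
     * stream.markSupported(), since that answer cannot be trusted.
     */
    static InputStream ensureReliableMark(InputStream stream) {
        return new BufferedInputStream(stream);
    }

    /**
     * Hypothetical broken stream used only for demonstration: it claims
     * mark support but drops the mark once end of stream is reached.
     */
    static class ForgetfulStream extends FilterInputStream {
        private boolean markLost = false;

        ForgetfulStream(byte[] data) {
            super(new ByteArrayInputStream(data));
        }

        @Override
        public boolean markSupported() {
            return true;
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b == -1) {
                markLost = true; // the mark is forgotten at end of stream
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n == -1) {
                markLost = true; // the mark is forgotten at end of stream
            }
            return n;
        }

        @Override
        public void mark(int readlimit) {
            markLost = false;
            super.mark(readlimit);
        }

        @Override
        public void reset() throws IOException {
            if (markLost) {
                throw new IOException("mark lost at end of stream");
            }
            super.reset();
        }
    }

    /** Marks the stream, reads to end of stream, then resets; returns the byte count. */
    static int countThenReset(InputStream in, int readlimit) throws IOException {
        in.mark(readlimit);
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        in.reset();
        return count;
    }
}
```

      Calling countThenReset directly on a ForgetfulStream fails with an IOException at reset(), whereas calling it on ensureReliableMark(new ForgetfulStream(data)) succeeds, because BufferedInputStream serves mark and reset from its own buffer and never calls mark() or reset() on the wrapped stream. The usual BufferedInputStream caveat still applies: the mark is only guaranteed while no more than readlimit bytes have been read since mark() was called.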

          People

            Assignee: Jukka Zitting (jukkaz)
            Reporter: Jukka Zitting (jukkaz)