TIKA-143

Add ParsingReader

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2
    • Component/s: parser
    • Labels: None

    Description

      Lucene Java takes a Reader as input when including content in a search index. Tika instead generates SAX events or (with WriteOutContentHandler) writes content to a Writer, which requires extra work when integrating with Lucene and other tools that expect a Reader.

      To cover that case, we should implement a ParsingReader class that takes an InputStream and a Tika Parser and exposes the parsed text content through the Reader interface.
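
      One possible shape for such a class is sketched below. It is not the committed implementation. It assumes the push-style parse runs in a background thread and writes into a PipedWriter, while the caller pulls characters from the connected PipedReader. TextExtractor and ParsingReaderSketch are hypothetical names used so that the example needs only the JDK. In Tika, the extractor role would presumably be filled by a Parser feeding a WriteOutContentHandler.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.Writer;

/** Hypothetical stand-in for "a Tika Parser that writes extracted text to a Writer". */
interface TextExtractor {
    void extract(InputStream in, Writer out) throws IOException;
}

/** Sketch: adapts push-style text extraction to the pull-style Reader interface. */
class ParsingReaderSketch extends Reader {
    private final PipedReader pipe = new PipedReader();
    // Set by the worker thread if extraction fails; rethrown to the caller on read.
    private volatile IOException failure;

    ParsingReaderSketch(InputStream stream, TextExtractor extractor) throws IOException {
        PipedWriter writer = new PipedWriter(pipe);
        Thread worker = new Thread(() -> {
            try {
                extractor.extract(stream, writer);
            } catch (IOException e) {
                failure = e; // recorded before the pipe is closed below
            } finally {
                // Closing the writer signals end of stream to the reading side.
                try { writer.close(); } catch (IOException ignored) { }
                try { stream.close(); } catch (IOException ignored) { }
            }
        }, "parsing-reader-sketch");
        worker.setDaemon(true);
        worker.start();
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = pipe.read(buf, off, len);
        if (failure != null) {
            throw failure; // surface extraction errors instead of a silent EOF
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        // Closing the read end makes further writes in the worker fail, which ends the parse.
        pipe.close();
    }
}
```

      A consumer that expects a Reader, such as a Lucene field, could then be handed an instance directly. The cost of this design is one extra thread per document in exchange for not buffering the whole extracted text in memory.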

          People

            Assignee: Jukka Zitting
            Reporter: Jukka Zitting