Tika / TIKA-3571

Add an interface for rendering engines


Details

    • Type: Wish
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4.1
    • Component/s: None
    • Labels: None

    Description

      We've now seen a few requests for both extracting text from PDFs and rendering them, and it would certainly be useful to have alternative engines for rendering files (e.g. this Alfresco study), including MS Office formats or at least PPTX...

      There are also cases where users don't want the rendered images themselves, but do want OCR to be run against them.

      I doubt I'll have a chance to work on this for a while, but I wanted to open an issue for discussion.
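
As a starting point for that discussion, here is a minimal sketch of what a pluggable rendering interface might look like. Everything in it is an assumption made for illustration: `RenderingEngine`, `StubEngine`, and `renderAndOcr` are hypothetical names, not existing Tika classes, and the stub does no real rendering (it emits one blank image per input line). The OCR step is passed in as a plain function so that the rendered images can be consumed and then discarded, which covers the case where only the OCR text is wanted.

```java
import java.awt.image.BufferedImage;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class RenderingSketch {

    /** Hypothetical interface for a rendering engine; not an existing Tika API. */
    interface RenderingEngine {
        /** Whether this engine can render the given media type. */
        boolean supports(String mediaType);

        /** Renders the document into one image per page. */
        List<BufferedImage> render(byte[] content, String mediaType);
    }

    /** Illustration-only engine: treats each input line as a page and emits a blank image for it. */
    static class StubEngine implements RenderingEngine {
        @Override
        public boolean supports(String mediaType) {
            return "text/plain".equals(mediaType);
        }

        @Override
        public List<BufferedImage> render(byte[] content, String mediaType) {
            int pageCount = new String(content, StandardCharsets.UTF_8).split("\n").length;
            List<BufferedImage> pages = new ArrayList<>();
            for (int i = 0; i < pageCount; i++) {
                pages.add(new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB));
            }
            return pages;
        }
    }

    /**
     * Picks the first engine that supports the media type, renders the pages,
     * runs the supplied OCR function on each page, and returns only the text.
     * The rendered images are never stored or returned.
     */
    static String renderAndOcr(List<RenderingEngine> engines, byte[] content, String mediaType,
                               Function<BufferedImage, String> ocr) {
        for (RenderingEngine engine : engines) {
            if (engine.supports(mediaType)) {
                StringBuilder text = new StringBuilder();
                for (BufferedImage page : engine.render(content, mediaType)) {
                    text.append(ocr.apply(page)).append('\n');
                }
                return text.toString();
            }
        }
        throw new IllegalArgumentException("No rendering engine for " + mediaType);
    }

    public static void main(String[] args) {
        byte[] doc = "first page\nsecond page".getBytes(StandardCharsets.UTF_8);
        // Stand-in for a real OCR call; it only reports the image size.
        Function<BufferedImage, String> fakeOcr =
                img -> "[" + img.getWidth() + "x" + img.getHeight() + " page]";
        System.out.print(renderAndOcr(List.of(new StubEngine()), doc, "text/plain", fakeOcr));
    }
}
```

Keeping engine selection (`supports`) separate from rendering would let alternative engines for PDF, PPTX, or other formats be registered side by side, but whether that split is the right one is exactly the kind of question this issue is meant to open up.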


          People

            Assignee: Tim Allison (tallison)
            Reporter: Tim Allison (tallison)
            Votes: 0
            Watchers: 5
