Specify PDFBox options via ParseContext

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.9
    • Fix Version/s: 1.1
    • Component/s: parser
    • Labels: None

    Description

      See https://issues.apache.org/jira/browse/TIKA-611. The options used by PDFBox are currently hard-coded in the PDFParser code. This change will allow them to be specified via the ParseContext object.
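
      The idea can be sketched as follows. This is a minimal, self-contained illustration and not the code from the attached patches: Context is a simplified stand-in for Tika's ParseContext (which offers typed set/get lookups keyed by class), while PdfOptions, SketchPdfParser, and the two option names are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ParseContextSketch {

    /** Simplified stand-in for a parse context: values are looked up by class. */
    static final class Context {
        private final Map<String, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key.getName(), value);
        }

        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key.getName());
            return value == null ? defaultValue : key.cast(value);
        }
    }

    /** Hypothetical options holder; the option names are assumptions. */
    static final class PdfOptions {
        final boolean sortByPosition;
        final boolean suppressDuplicateOverlappingText;

        PdfOptions(boolean sortByPosition, boolean suppressDuplicateOverlappingText) {
            this.sortByPosition = sortByPosition;
            this.suppressDuplicateOverlappingText = suppressDuplicateOverlappingText;
        }
    }

    /** Hypothetical parser that reads its options from the context. */
    static final class SketchPdfParser {
        // Used only when the caller did not put a PdfOptions into the context.
        static final PdfOptions DEFAULTS = new PdfOptions(false, false);

        String describe(Context context) {
            PdfOptions options = context.get(PdfOptions.class, DEFAULTS);
            return "sortByPosition=" + options.sortByPosition
                    + ", suppressDuplicateOverlappingText="
                    + options.suppressDuplicateOverlappingText;
        }
    }

    public static void main(String[] args) {
        SketchPdfParser parser = new SketchPdfParser();
        Context context = new Context();

        // No options supplied: the parser falls back to its defaults.
        System.out.println(parser.describe(context));

        // Options supplied by the caller override the defaults.
        context.set(PdfOptions.class, new PdfOptions(true, false));
        System.out.println(parser.describe(context));
    }
}
```

      Keying the lookup by class keeps the parser's method signature unchanged: callers that do not care about PDF options pass an empty context and get the previous behaviour.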

      Attachments

        1. TIKA-612-testcase.patch (2 kB, Michael McCandless)
        2. TIKA-612.patch (7 kB, Michael McCandless)
        3. Tika-612.patch (4 kB, Julien Nioche)
        4. testPDFTwoColumns.pdf (56 kB, Michael McCandless)

            People

              Assignee: Michael McCandless (mikemccand)
              Reporter: Julien Nioche (jnioche)
              Votes: 2
              Watchers: 4
