Tika / TIKA-612

Specify PDFBox options via ParseContext

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.9
    • Fix Version/s: 1.1
    • Component/s: parser
    • Labels:
      None

      Description

      See https://issues.apache.org/jira/browse/TIKA-611. The options used by PDFBox are currently hard-coded in the PDFParser code; we will allow them to be specified via the ParseContext object.
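
      A minimal sketch of the idea, not the actual patch: a parser looks up an options object in a type-keyed context and falls back to built-in defaults when the caller supplied none. Everything here is a hypothetical stand-in so the example runs without Tika on the classpath. SketchContext imitates the set(Class, value) and get(Class, default) shape of ParseContext, and PdfOptionsSketch and its two fields are illustrative names that loosely mirror PDFTextStripper's sortByPosition and suppressDuplicateOverlappingText settings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ParseContext: maps a class to one instance of it.
final class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Returns the stored instance, or defaultValue when nothing was set.
    <T> T get(Class<T> key, T defaultValue) {
        Object value = entries.get(key);
        return value == null ? defaultValue : key.cast(value);
    }
}

// Hypothetical options holder; the field names are assumptions for illustration.
final class PdfOptionsSketch {
    boolean sortByPosition = false;
    boolean suppressDuplicateOverlappingText = false;
}

// Hypothetical parser stub: reads options from the context instead of hard-coding them.
final class PdfParserSketch {
    private static final PdfOptionsSketch DEFAULTS = new PdfOptionsSketch();

    String describe(SketchContext context) {
        PdfOptionsSketch options = context.get(PdfOptionsSketch.class, DEFAULTS);
        return "sortByPosition=" + options.sortByPosition
                + ", suppressDuplicateOverlappingText="
                + options.suppressDuplicateOverlappingText;
    }
}
```

      A caller that wants non-default behaviour would create a PdfOptionsSketch, change the fields it cares about, and register it with context.set(PdfOptionsSketch.class, options) before parsing. Callers that set nothing keep the previous behaviour.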

        Attachments

        1. testPDFTwoColumns.pdf
          56 kB
          Michael McCandless
        2. Tika-612.patch
          4 kB
          Julien Nioche
        3. TIKA-612.patch
          7 kB
          Michael McCandless
        4. TIKA-612-testcase.patch
          2 kB
          Michael McCandless

              People

              • Assignee:
                mikemccand Michael McCandless
                Reporter:
                jnioche Julien Nioche
              • Votes:
                2
                Watchers:
                4
