TIKA-612: Specify PDFBox options via ParseContext


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.9
    • Fix Version/s: 1.1
    • Component/s: parser
    • Labels: None

    Description

      See https://issues.apache.org/jira/browse/TIKA-611. The options used by PDFBox are currently hard-coded in the PDFParser code. This change will allow them to be specified via the ParseContext object.
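
      The idea described above can be sketched as follows. This is only an illustration of the pattern, not the code from the attached patches: TypedContext is a hypothetical stand-in that mimics the class-keyed set/get style of Tika's ParseContext, and PdfOptions with its two fields is an assumed options holder whose names are chosen for illustration. The parser looks up the options in the context and falls back to its built-in defaults when the caller supplied none.

```java
import java.util.HashMap;
import java.util.Map;

public class ParseContextSketch {

    // Hypothetical stand-in for a class-keyed context (not Tika's ParseContext).
    static final class TypedContext {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        // Returns the stored value, or the default when nothing was set.
        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key);
            return value == null ? defaultValue : key.cast(value);
        }
    }

    // Assumed options holder; field names and defaults are illustrative only.
    static final class PdfOptions {
        boolean sortByPosition = false;
        boolean enableAutoSpace = true;
    }

    // Stand-in for the parser: reads options from the context
    // instead of using values hard-coded in the parser itself.
    static String describe(TypedContext context) {
        PdfOptions options = context.get(PdfOptions.class, new PdfOptions());
        return "sortByPosition=" + options.sortByPosition
                + ", enableAutoSpace=" + options.enableAutoSpace;
    }

    public static void main(String[] args) {
        TypedContext context = new TypedContext();
        System.out.println(describe(context)); // defaults apply

        PdfOptions custom = new PdfOptions();
        custom.sortByPosition = true;
        context.set(PdfOptions.class, custom);
        System.out.println(describe(context)); // caller-supplied options apply
    }
}
```

      Keying the lookup on the options class means callers that never set anything keep the existing behavior, while callers that need different extraction settings can supply them per parse call.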

      Attachments

        1. testPDFTwoColumns.pdf
          56 kB
          Michael McCandless
        2. Tika-612.patch
          4 kB
          Julien Nioche
        3. TIKA-612.patch
          7 kB
          Michael McCandless
        4. TIKA-612-testcase.patch
          2 kB
          Michael McCandless

            People

              mikemccand Michael McCandless
              jnioche Julien Nioche
              Votes: 2
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved: