PDFBOX-726

PDFTextStripper: allow access to currentPageNo variable

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Text extraction
    • Labels:
      None

      Description

      I've extended org.apache.pdfbox.util.PDFTextStripper and use it to perform a two-pass extraction over a document. However, the second pass doesn't happen because I cannot change currentPageNo, the field that tracks the current page number in the PDF document. The field is declared private, and only a getter is provided.

      The only place currentPageNo is reset to 0 is in 'writePage(PDDocument, OutputStream)', which I am overriding and not calling.

      Two possible resolutions:

      • make currentPageNo protected instead of private (preferred)
      • add setCurrentPageNo method
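
      A minimal sketch of the second resolution, under stated assumptions. SketchStripper is a hypothetical stand-in, not the PDFBox class: it only mirrors the pattern described above (a private page counter that keeps advancing and is compared against a page range). The names SketchStripper, TwoPassStripper, extractTwice and the endPage filter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the stripper base class (NOT the PDFBox API).
class SketchStripper {
    private int currentPageNo = 0;   // private, as in the report
    private final int endPage;       // assumed page-range limit

    SketchStripper(int endPage) {
        this.endPage = endPage;
    }

    int getCurrentPageNo() {
        return currentPageNo;
    }

    // Proposed addition: lets a subclass rewind the counter between passes.
    protected void setCurrentPageNo(int pageNo) {
        this.currentPageNo = pageNo;
    }

    // Advances the counter for every page; only pages up to endPage are processed.
    protected void processPages(List<String> pages) {
        for (String page : pages) {
            currentPageNo++;
            if (currentPageNo <= endPage) {
                processPage(page);
            }
        }
    }

    protected void processPage(String page) {
        // no-op in the base class
    }
}

// Hypothetical subclass that walks the document twice.
class TwoPassStripper extends SketchStripper {
    final List<String> visited = new ArrayList<>();

    TwoPassStripper(int endPage) {
        super(endPage);
    }

    @Override
    protected void processPage(String page) {
        visited.add("page " + getCurrentPageNo() + ": " + page);
    }

    void extractTwice(List<String> pages, boolean resetBetweenPasses) {
        processPages(pages);
        if (resetBetweenPasses) {
            setCurrentPageNo(0);     // without this, the counter stays past endPage
        }
        processPages(pages);
    }
}

class TwoPassDemo {
    public static void main(String[] args) {
        List<String> pages = Arrays.asList("first", "second");

        TwoPassStripper withReset = new TwoPassStripper(2);
        withReset.extractTwice(pages, true);
        System.out.println(withReset.visited.size());    // prints 4

        TwoPassStripper withoutReset = new TwoPassStripper(2);
        withoutReset.extractTwice(pages, false);
        System.out.println(withoutReset.visited.size()); // prints 2
    }
}
```

      In this sketch the second pass is skipped without the reset because the counter has already moved past endPage. Making the field protected (the first resolution) would allow the same rewind without a setter, at the cost of exposing the field directly to every subclass.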

      Thank you,
      Ryan

            People

            • Assignee: Unassigned
            • Reporter: Ryan Nideffer (rnideffer)
