I've extended org.apache.pdfbox.util.PDFTextStripper and I'm using it to perform a two-pass extraction over a document. However, the second pass doesn't happen because I am unable to alter the variable currentPageNo, which tracks the current page number in the PDF document. The field is declared private, and only a getter is provided.
The only place currentPageNo is reset to 0 is in 'writePage(PDDocument, OutputStream)', which I am overriding and therefore not calling.
Two possible resolutions:
- make currentPageNo protected instead of private (preferred)
- add a setCurrentPageNo method
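
To illustrate the request, here is a minimal, self-contained sketch. It does not use PDFBox: StripperSketch and TwoPassStripperSketch are hypothetical stand-ins, and the end-page check inside processPages is my assumption about why a stale counter causes the second pass to be skipped. It shows both proposed changes (a protected field and a setter) and how a subclass could use either to reset the counter between passes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PDFTextStripper; NOT the real PDFBox class.
class StripperSketch {
    // Proposed change 1: protected instead of private, so subclasses can reset it.
    protected int currentPageNo = 0;

    public int getCurrentPageNo() {
        return currentPageNo;
    }

    // Proposed change 2 (alternative): a setter for subclasses.
    protected void setCurrentPageNo(int pageNo) {
        currentPageNo = pageNo;
    }

    // Assumed behaviour: the counter advances once per page, and a page is
    // only processed while the counter is within the end-page bound.
    protected void processPages(int pageCount, int endPage, List<Integer> seen) {
        for (int i = 0; i < pageCount; i++) {
            currentPageNo++;
            if (currentPageNo <= endPage) {
                seen.add(currentPageNo);
            }
        }
    }
}

// Hypothetical subclass that runs two passes over the same pages.
class TwoPassStripperSketch extends StripperSketch {
    List<Integer> extract(int pageCount, boolean resetBetweenPasses) {
        List<Integer> seen = new ArrayList<>();
        processPages(pageCount, pageCount, seen); // first pass
        if (resetBetweenPasses) {
            currentPageNo = 0; // or setCurrentPageNo(0); impossible with a private field
        }
        processPages(pageCount, pageCount, seen); // second pass
        return seen;
    }
}
```

In this sketch, extract(3, false) only records the pages of the first pass, because the counter is already past the end page when the second pass starts. extract(3, true) records the pages of both passes.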