As reported by Noam S. on the user mailing list:
My problem is that when I try to getText(doc) from a certain section of the PDF using setStartBookmark(item) and setEndBookmark(item), I get all the text rather than just the text from the specified section.
While trying to resolve this I realized that the writeText(doc, outputStream) method always calls the resetEngine() method. That resets all the parameters and deletes the bookmarks I set.
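The reported mechanism can be illustrated with a minimal, self-contained sketch. It does not use the PDFBox classes: ToyStripper, its resetClearsBookmarks flag, and the section names are hypothetical stand-ins, and the reset behaviour is modelled only as the report describes it. The document is reduced to a list of section names, and a null bookmark means "no limit".

```java
import java.util.Arrays;
import java.util.List;

public class BookmarkResetSketch {

    /** Hypothetical stand-in for a text stripper; NOT the PDFBox class. */
    static class ToyStripper {
        private String startBookmark; // null means "from the first section"
        private String endBookmark;   // null means "to the last section"
        private final boolean resetClearsBookmarks;

        ToyStripper(boolean resetClearsBookmarks) {
            this.resetClearsBookmarks = resetClearsBookmarks;
        }

        void setStartBookmark(String title) { startBookmark = title; }

        void setEndBookmark(String title) { endBookmark = title; }

        // Assumption: models the reported behaviour, where the reset
        // also discards the bookmarks chosen by the caller.
        private void resetEngine() {
            if (resetClearsBookmarks) {
                startBookmark = null;
                endBookmark = null;
            }
        }

        String getText(List<String> sections) {
            resetEngine(); // runs on every extraction, as in the report
            int from = (startBookmark == null) ? 0 : sections.indexOf(startBookmark);
            int to = (endBookmark == null) ? sections.size() - 1 : sections.indexOf(endBookmark);
            if (from < 0 || to < 0 || from > to) {
                return ""; // unknown bookmark or inverted range
            }
            // Inclusive range, so identical start and end yield one section.
            return String.join(" ", sections.subList(from, to + 1));
        }
    }

    /** Extracts section "B" only, with or without the bookmark-clearing reset. */
    static String extract(boolean resetClearsBookmarks) {
        ToyStripper stripper = new ToyStripper(resetClearsBookmarks);
        stripper.setStartBookmark("B");
        stripper.setEndBookmark("B");
        return stripper.getText(Arrays.asList("A", "B", "C"));
    }

    public static void main(String[] args) {
        System.out.println(extract(true));  // prints "A B C": bookmarks were lost
        System.out.println(extract(false)); // prints "B": bookmarks survived
    }
}
```

With the bookmark-clearing reset the whole document comes back, which matches the symptom described; without it only the requested section is returned. The same call also exercises the case of identical start and end bookmarks.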
I also found another weird piece of code in the trunk, which would mean that text extraction fails if the start and end bookmarks are identical:
Earlier, that segment was:
which makes more sense. The change was made last year in rev [https://svn.apache.org/r1634252] as part of the pagetree refactoring.
I am writing a test to prevent this from breaking in the future.