Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Bug
Affects Version/s: 2.0.21
None
None
Environment: Windows 10, AdoptOpenJDK 11.0.8, 64-bit
Description
Hello everyone,
PDFBox cannot extract the text from the attached document. It extracts only the first page, which contains "Please wait..."; the other pages are missing. I've also tried loading the file in PDFDebugger, but it shows only the first page. The document opens fine in Adobe, and all the text is visible there. I suspect it's some kind of dynamically generated content.
Sample code to reproduce the issue:
try (PDDocument document = PDDocument.load(new File("c0015_re_1375881383129_eng[1].pdf"), "")) {
    PDFTextStripper stripper = new PDFTextStripper();
    String text = stripper.getText(document);
    System.out.println("Text: " + text);
}
Thanks.
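A rough way to test the "dynamically generated content" suspicion without PDFBox: a single "Please wait..." page is commonly the placeholder of a dynamic XFA form, whose real content is XML that the viewer renders at open time. The sketch below uses only the Java standard library and looks for the `/XFA` name in the raw file bytes. The class and method names (`XfaProbe`, `containsXfaKey`) are made up for illustration, and the check is only a heuristic: it assumes the form dictionary is stored uncompressed, so it can miss the key when it sits inside a compressed object stream, and a match does not prove the form is dynamic.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: a crude byte-level probe for an XFA entry.
class XfaProbe {

    // Returns true if the raw bytes contain the PDF name token "/XFA".
    static boolean containsXfaKey(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary data is not mangled.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        int idx = raw.indexOf("/XFA");
        while (idx >= 0) {
            int next = idx + 4;
            // Accept only the complete name, not a longer one such as /XFAData.
            if (next >= raw.length() || endsName(raw.charAt(next))) {
                return true;
            }
            idx = raw.indexOf("/XFA", next);
        }
        return false;
    }

    // PDF whitespace and delimiter characters terminate a name token.
    private static boolean endsName(char c) {
        return c == 0 || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' '
                || "()<>[]{}/%".indexOf(c) >= 0;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java XfaProbe.java <file.pdf>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("Contains /XFA key: " + containsXfaKey(bytes));
    }
}
```

If the probe reports a match for the attached file, the missing pages would most likely live in the XFA XML rather than in ordinary page content streams, which would explain why text extraction only sees the placeholder page.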