Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.8.0-incubator, 1.3.1, 1.4.0
None
Environment: Win XP
Description
Hello,
I have a single-page PDF file. When I try to extract its text using:
String pageData = stripper.getText( pdfFile );
it ignores some line breaks, so the last word of one line and the first word of the next line appear as a single word with no space between them.
However, if I copy the text manually from the PDF and paste it into a text editor, line breaks appear after the same lines that cause the problem in PDFBox.
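To make the symptom easier to pin down, here is a minimal sketch that compares the manually copied text against the extracted text and lists the places where two words were joined across a line break. It uses only the Java standard library and does not call PDFBox. The class name `FusedWordCheck` and the method `findFusedWords` are hypothetical names chosen for this illustration, and the check is a plain substring match, so it can report false positives.

```java
import java.util.ArrayList;
import java.util.List;

public class FusedWordCheck {

    /**
     * For every line break in the reference text (the manually copied text),
     * joins the last word of the line with the first word of the next line
     * and reports the joined token if it occurs in the extracted text.
     */
    public static List<String> findFusedWords(String reference, String extracted) {
        List<String> fused = new ArrayList<String>();
        String[] lines = reference.split("\\r?\\n");
        for (int i = 0; i + 1 < lines.length; i++) {
            String[] left = lines[i].trim().split("\\s+");
            String[] right = lines[i + 1].trim().split("\\s+");
            String last = left[left.length - 1];
            String first = right[0];
            // Skip blank lines: there is nothing to join.
            if (last.isEmpty() || first.isEmpty()) {
                continue;
            }
            String joined = last + first;
            if (extracted.contains(joined)) {
                fused.add(joined);
            }
        }
        return fused;
    }

    public static void main(String[] args) {
        // Made-up sample strings, not the contents of the attached file.
        String reference = "first line ends here\nnext line starts";
        String extracted = "first line ends herenext line starts";
        System.out.println(findFusedWords(reference, extracted)); // prints [herenext]
    }
}
```

Passing the text pasted from the PDF viewer as `reference` and the value of `pageData` as `extracted` should list the word pairs that lost their line break.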
Please check the attached file as a sample.
Is there a way to fix this?
Best regards,