Details
-
Type: Bug
-
Status: Closed
-
Priority: Major
-
Resolution: Not A Problem
-
Affects Version/s: 2.0.5, 2.0.6
-
Fix Version/s: None
-
Component/s: Text extraction
-
Labels:
-
Environment: Windows 7 64-bit, Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
-
Flags: Patch
Description
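Adobe Reader displays the attached PDF "DataDirect Connect for ODBC User's Guide and Reference.pdf" without problems, but the text extracted by PDFTextStripper is garbled.
First 256 characters of the extracted text from PDFTextStripper (each character followed by its hex code; control characters appear as hex code only):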
000d
000d
000d
000d
000d
000d
000d
000d
000d 0001 B 0042 O 004f E 0045 0001 4 0034 F 0046 R 0052 V 0056 F 0046 - 002d J 004a O 004f L 004c 0001 B 0042 S 0053 F 0046 0001 S 0053 F 0046 H 0048 J 004a T 0054 U 0055 F 0046
I have a few more PDFs with the same symptom.
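
For anyone reproducing the listing above, here is a minimal sketch of how such a char + hex dump can be produced. The class and method names (`ExtractedTextDump`, `dump`) are hypothetical and not part of the report; the hard-coded sample string stands in for the text returned by PDFTextStripper, and printing only the hex code for control characters is an assumption inferred from the output shown.

```java
public class ExtractedTextDump {

    /**
     * Formats the first `limit` UTF-16 code units of `text` as space-separated
     * tokens: "c hhhh" for printable characters, "hhhh" alone for control
     * characters (assumed convention, inferred from the reported output).
     */
    public static String dump(String text, int limit) {
        StringBuilder out = new StringBuilder();
        int n = Math.min(limit, text.length());
        for (int i = 0; i < n; i++) {
            char c = text.charAt(i);
            if (i > 0) {
                out.append(' ');
            }
            if (!Character.isISOControl(c)) {
                out.append(c).append(' ');
            }
            // Four-digit lowercase hex of the code unit, e.g. 'O' becomes 004f.
            out.append(String.format("%04x", (int) c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the extracted text: a carriage return, U+0001, then "BOE".
        String extracted = "\r\u0001BOE";
        System.out.println(dump(extracted, 256));
    }
}
```

To run this against a real file with PDFBox 2.0.x, the sample string would be replaced by something like `new PDFTextStripper().getText(PDDocument.load(new File(path)))`, where `path` is a placeholder for the PDF location.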