Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
1.8.7, 1.8.8, 2.0.0
Description
On at least one file in govdocs1, PDFBox 1.8.7 extracts less text than 1.8.6 did. When running the app.jar with ExtractText, 1.8.7 fails to extract the following text:
Designated Counties No Designation Individual Assistance All counties are eligible ITS Mapping & Analysis CenterWashington, DC 05/09/08 -- 09:36 AM EDT Source: Disaster Federal Registry Notice05/08/2008 Location Map MapID 196d109cd27 for Hazard Mitigation
from govdocs1's 894770.pdf.
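One way to pin down a regression like this is to run ExtractText with both app jars and diff the two outputs token by token. The sketch below is only an illustration, not part of the original report: the helper name `missing_runs` and the output file names in the comments are hypothetical, and it assumes both extractions have already been written to text files.

```python
import difflib


def missing_runs(old_text: str, new_text: str) -> list[str]:
    """Return runs of tokens that appear in old_text but were lost in new_text.

    Tokens are whitespace-separated words, so differences in line breaks
    or spacing between the two extractions are ignored.
    """
    old_tokens = old_text.split()
    new_tokens = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    runs = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete" and "replace" both mean old tokens have no match in the new text.
        if tag in ("delete", "replace"):
            runs.append(" ".join(old_tokens[i1:i2]))
    return runs


# Hypothetical usage, assuming the two outputs were produced beforehand, e.g.:
#   java -jar pdfbox-app-1.8.6.jar ExtractText 894770.pdf out-1.8.6.txt
#   java -jar pdfbox-app-1.8.7.jar ExtractText 894770.pdf out-1.8.7.txt
# and then read with open(...).read() before calling missing_runs(old, new).
```

An empty result means every token from the older extraction still appears, in order, in the newer one; any returned run is a candidate for text dropped by the newer version.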
Attachments
Issue Links
- is related to
  - PDFBOX-2385 inline image with EI at the end incorrectly parsed (Closed)
  - TIKA-1419 Upgrade to PDFBox 1.8.7 (Closed)
- relates to
  - TIKA-1442 Upgrade to PDFBox 1.8.8 (Closed)