Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Given the regressions we identified in PDFBox 1.8.7, we should upgrade to 1.8.8 as soon as it is ready. I'm tempted to call this a blocker on Tika 1.7. Let's use this issue to carry on the discussion of regression testing (if any further discussion is necessary) or any other prep that needs to happen before 1.8.8's release.
Attachments
Issue Links
- is related to
  - PDFBOX-2385 inline image with EI at the end incorrectly parsed (Closed)
  - PDFBOX-2421 Poor text extraction and rendering of file with non embedded type1 font (Closed)
  - PDFBOX-2449 Character missing in text extraction (Closed)
  - PDFBOX-2493 OOM with corrupt PDF file (Closed)
  - PDFBOX-2523 IOException: Error: Expected a long type at offset 1218571, instead got 'xref' (Closed)
  - PDFBOX-2533 Poor rendering with non-sequential parser (Closed)
  - PDFBOX-2534 Less pages shown with the non-sequential parser (Closed)
  - PDFBOX-2376 Small regression in text extraction with PDFBox 1.8.7 vs. 1.8.6 (Closed)
  - PDFBOX-2377 Apparent regression in character mapping in a few files from govdocs1 (Closed)
  - PDFBOX-2527 IOException: Negative seek offset in NonSequentialPDFParser (Closed)
  - PDFBOX-2528 IOException: Object must be defined and must not be compressed object: 0:0 (Closed)
  - TIKA-1419 Upgrade to PDFBox 1.8.7 (Closed)
- relates to
  - TIKA-1467 pdf:encrypted:false with encrypted pdf (Open)