PDFBox

Text extraction


Issues: Unresolved

Type  Key          Summary  Due Date
Bug   PDFBOX-2252  PDFTextStripper has problem with documents with mixed language directions
Bug   PDFBOX-2749  Annotations character bounding boxes size 3 times higher than expected
Bug   PDFBOX-448   Columns in text not extracted separately


Issues: Updated recently

Type         Key          Summary  Updated
Bug          PDFBOX-2908  PDFTextStripper.writeText is slow
Improvement  PDFBOX-2912  PDFTextStripper: positionX < previous positionX case
Bug          PDFBOX-2272  Can't extract vertical text correctly


Versions: Unreleased

Status      Name    Release date
Unreleased  1.8.11
Unreleased  2.0.0
Unreleased  2.1.0
Unreleased  3.0.0