PDFBox
Text extraction

Summary

Issues: Unresolved

Type  Key          Summary                                                                        Due Date
Bug   PDFBOX-2252  PDFTextStripper has problem with documents with mixed language directions
Bug   PDFBOX-2749  Annotations character bounding boxes size 3 times higher than expected
Bug   PDFBOX-448   Columns in text not extracted separately

Issues: Updated recently

Type  Key          Summary                                   Updated
Bug   PDFBOX-2272  Can't extract vertical text correctly
Bug   PDFBOX-2792  Text extraction ignores bookmarks
Bug   PDFBOX-2867  Correct use of Float.NaN

Versions: Unreleased

Status      Name    Release date
Unreleased  1.8.11
Unreleased  2.0.0
Unreleased  2.1.0
Unreleased  3.0.0