Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
2.0.12, 2.0.13
-
None
Description
This was detected while reviewing the results of a regression test kindly run by tallison@apache.org (see the end of PDFBOX-4371) for his work on TIKA-2779. The output contained many new words, but some of them were missing the spaces between them. The cause is a wrong text angle (180 instead of 0): the font matrix was not taken into account when computing the angle, and for Type 3 fonts the font matrix is often a rotation or a flip.
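To illustrate the effect, here is a minimal sketch, not the actual PDFBox code. It uses `java.awt.geom.AffineTransform` as a stand-in for the PDF matrices, and the class name, method names, matrix values and multiplication order are all assumptions chosen for the example. The text matrix is vertically flipped and the hypothetical Type 3 font matrix `[0.001 0 0 -0.001 0 0]` flips it back, so the glyphs are actually upright.

```java
import java.awt.geom.AffineTransform;

// Hypothetical helper class, for illustration only (not part of PDFBox).
class FontMatrixAngleSketch {

    // Angle in degrees taken from the text matrix alone,
    // i.e. the behaviour that leads to the wrong result described above.
    static int angleWithoutFontMatrix(AffineTransform textMatrix) {
        return toAngle(textMatrix);
    }

    // Angle in degrees after combining the text matrix with the font matrix.
    // The multiplication order is an assumption of this sketch.
    static int angleWithFontMatrix(AffineTransform textMatrix, AffineTransform fontMatrix) {
        AffineTransform combined = new AffineTransform(textMatrix); // copy, leave input untouched
        combined.concatenate(fontMatrix);
        return toAngle(combined);
    }

    // Derives a rounded angle from the matrix components b (shearY) and d (scaleY).
    static int toAngle(AffineTransform m) {
        return (int) Math.round(Math.toDegrees(Math.atan2(m.getShearY(), m.getScaleY())));
    }

    public static void main(String[] args) {
        // Assumed example values: a flipped text matrix and a flipping Type 3 font matrix.
        AffineTransform textMatrix = new AffineTransform(1, 0, 0, -1, 0, 0);
        AffineTransform type3FontMatrix = new AffineTransform(0.001, 0, 0, -0.001, 0, 0);

        System.out.println(angleWithoutFontMatrix(textMatrix));               // prints 180
        System.out.println(angleWithFontMatrix(textMatrix, type3FontMatrix)); // prints 0
    }
}
```

With the font matrix ignored, the text looks rotated by 180 degrees and would be grouped with the wrong angle, which is consistent with the missing spaces described above.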
Attachments
Issue Links
- relates to
- PDFBOX-4371 Improve ExtractText utility so that it can extract rotated text automatically (Closed)
- TIKA-2779 Integrate/parameterize new rotated text handling in PDFBox (Resolved)