Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Description
On TIKA-2257, ccreutzig shared a file that causes PDFBox to flip the order of the fatha (an Arabic vowel mark) in the extracted text. I suspect this happens in normalizeAdd within PDFTextStripper, but I'm not familiar enough with the code to diagnose and fix it.
I confirmed this is still happening in trunk.
The triggering file and the start of a diagnosis are available on the Tika issue.
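When comparing extracted text against the expected text, it can help to look at the stored code point order rather than the rendered string, since a displaced fatha (U+064E) is hard to spot visually. The following is a minimal, JDK-only sketch under stated assumptions: the class MarkOrderCheck and its methods are hypothetical helpers, not part of PDFBox or Tika, and the "orphaned mark" check is only a heuristic. It flags a non-spacing mark that has no preceding letter in the same word; a mark that was moved onto the wrong letter inside a word would not be caught.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of PDFBox or Tika.
final class MarkOrderCheck {

    /** Lists each code point in logical (stored) order as "U+XXXX NAME". */
    static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp ->
                out.add(String.format("U+%04X %s", cp, Character.getName(cp))));
        return out;
    }

    /**
     * Returns the char indices of non-spacing marks (such as fatha) that are
     * not preceded by a letter within the same run of letters and marks.
     * Assumption: in correctly ordered text a mark follows its base letter.
     */
    static List<Integer> orphanedMarks(String text) {
        List<Integer> out = new ArrayList<>();
        boolean baseSeen = false; // true while inside a letter(+marks) run
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.getType(cp) == Character.NON_SPACING_MARK) {
                if (!baseSeen) {
                    out.add(i);
                }
            } else {
                baseSeen = Character.isLetter(cp);
            }
            i += Character.charCount(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        // Pass the extracted text as the first argument, or use a sample
        // where the fatha wrongly precedes its base letter.
        String extracted = args.length > 0 ? args[0] : "\u064E\u0628\u062A";
        describe(extracted).forEach(System.out::println);
        System.out.println("Orphaned marks at: " + orphanedMarks(extracted));
    }
}
```

Running this on the text extracted from the triggering file and on the expected text should make any difference in mark placement visible line by line.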
Attachments
Issue Links
- is depended upon by: TIKA-2257 Arabic vowel marks displaced when reading from PDF (Open)