Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Later
Description
The code below extracts and prints the text of a PDF. In the output, words that are on different lines of the PDF are concatenated together with no separator between them.
BodyContentHandler handler = new BodyContentHandler();
byte[] bytes = IOUtils.toByteArray(new FileInputStream(new File("resume.pdf")));
new PDFParser().parse(new ByteArrayInputStream(bytes), handler, new Metadata(), new ParseContext());
System.out.println(handler.toString());
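To make the symptom easier to test for, here is a minimal sketch of a check that flags extracted text in which the last word of one line has been fused with the first word of the next. It uses only the Java standard library. The class name FusedLineCheck, its helper methods, and the sample strings are hypothetical and are not part of Tika or of the original report.

```java
import java.util.List;

// Hypothetical helper, not part of Tika: detects words fused across line boundaries.
public class FusedLineCheck {

    // First whitespace-delimited word of a line, or "" for a blank line.
    static String firstWord(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        return trimmed.split("\\s+")[0];
    }

    // Last whitespace-delimited word of a line, or "" for a blank line.
    static String lastWord(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] parts = trimmed.split("\\s+");
        return parts[parts.length - 1];
    }

    // True if, for any pair of consecutive expected lines, the last word of the
    // first line and the first word of the second appear joined in the extracted text.
    static boolean hasFusedLines(String extracted, List<String> expectedLines) {
        for (int i = 0; i + 1 < expectedLines.size(); i++) {
            String end = lastWord(expectedLines.get(i));
            String start = firstWord(expectedLines.get(i + 1));
            if (end.isEmpty() || start.isEmpty()) {
                continue; // skip blank lines
            }
            if (extracted.contains(end + start)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative sample text only; not taken from the reporter's resume.pdf.
        List<String> expected = List.of("Senior Software", "Engineer at Example");
        System.out.println(hasFusedLines("Senior SoftwareEngineer at Example", expected));
        System.out.println(hasFusedLines("Senior Software\nEngineer at Example", expected));
    }
}
```

In a real test, the first argument would be the string returned by handler.toString() and the list would hold the lines as they appear in the source PDF.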
Attachments
Issue Links
- is blocked by TIKA-1285 Upgrade to PDFBox 2.0.0 when available (Closed)