[PDFBOX-448] Columns in text not extracted separately - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Affects Version/s: 1.8.7, 2.0.0
Fix Version/s: None
Component/s: Text extraction
Labels:
- beads

Description

The paper attached to PDFBOX-80 has two columns of text, but the extracted text is not separated by column. Instead, each extracted line combines the text from both columns.

PDFTextStripper has a notion of columns and "articles / beads", but they are not being used with this file.
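
To make the symptom concrete, here is a minimal sketch of the two reading orders involved. It is not PDFBox code and does not reproduce how PDFTextStripper works internally: the `Fragment` class, the method names, and the fixed `gutterX` boundary between the columns are all hypothetical, and it assumes a simple two-column page where y grows downward.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: none of these names come from PDFBox.
final class ColumnOrderSketch {

    /** Hypothetical text fragment: left edge x, top edge y (y grows downward), and its text. */
    static final class Fragment {
        final double x;
        final double y;
        final String text;

        Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    // Orders fragments top to bottom, then left to right within the same y.
    private static final Comparator<Fragment> TOP_TO_BOTTOM =
            Comparator.comparingDouble((Fragment f) -> f.y)
                      .thenComparingDouble(f -> f.x);

    /** Reads across the full page width, line by line: both columns end up on each line. */
    static String readByLine(List<Fragment> fragments) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(TOP_TO_BOTTOM);
        return join(sorted);
    }

    /** Splits fragments at an assumed gutter position, then reads the left column before the right. */
    static String readByColumn(List<Fragment> fragments, double gutterX) {
        List<Fragment> left = new ArrayList<>();
        List<Fragment> right = new ArrayList<>();
        for (Fragment f : fragments) {
            if (f.x < gutterX) {
                left.add(f);
            } else {
                right.add(f);
            }
        }
        left.sort(TOP_TO_BOTTOM);
        right.sort(TOP_TO_BOTTOM);
        left.addAll(right);
        return join(left);
    }

    private static String join(List<Fragment> fragments) {
        List<String> texts = new ArrayList<>();
        for (Fragment f : fragments) {
            texts.add(f.text);
        }
        return String.join(" ", texts);
    }

    public static void main(String[] args) {
        List<Fragment> page = new ArrayList<>();
        page.add(new Fragment(320, 100, "Right one"));
        page.add(new Fragment(50, 100, "Left one"));
        page.add(new Fragment(50, 115, "Left two"));
        page.add(new Fragment(320, 115, "Right two"));

        System.out.println(readByLine(page));        // Left one Right one Left two Right two
        System.out.println(readByColumn(page, 300)); // Left one Left two Right one Right two
    }
}
```

The first order corresponds to the output described in this report; the second is the column-separated order the reporter expected. A real page would need the gutter position detected rather than passed in.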

Attachments

WBPaper00003120.pdf
09/Aug/10 21:07
407 kB
Arun Rangarajan

People

Assignee: Unassigned
Reporter: Brian Carrier
Votes: 2
Watchers: 6

Dates

Created: 01/Apr/09 15:21
Updated: 09/Nov/15 16:44
Resolved: 09/Nov/15 16:44