Description
[imported from SourceForge]
http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1565617
Originally submitted by gagravarr on 2006-09-26 03:30.
I'm trying to extract text from a pdf file
(http://www.cifor.cgiar.org/mla/download/publication/mozambique.pdf),
but I'm getting an IndexOutOfBoundsException on it:
Exception in thread "main"
java.lang.IndexOutOfBoundsException: Index: 4, Size: 4
at
java.util.ArrayList.RangeCheck(ArrayList.java:546)
at java.util.ArrayList.get(ArrayList.java:321)
at
org.pdfbox.util.operator.Concatenate.process(Concatenate.java:69)
at
org.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:494)
at
org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:207)
at
org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:160)
at
org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:355)
at
org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:268)
at
org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)
at
org.pdfbox.ExtractText.main(ExtractText.java:237)
I've tried with 0.7.2, and 0.7.3-dev-20060920, and I
get the same exception from both versions.
Nick