Bug 47535 - [PATCH] Exception on WordExtractor.getFootnoteText
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.5-dev
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
 
Reported: 2009-07-15 05:37 UTC by Maxim Valyanskiy
Modified: 2009-07-18 03:03 UTC

Attachments
patch (1.59 KB, patch)
2009-07-15 05:43 UTC, Maxim Valyanskiy

Description Maxim Valyanskiy 2009-07-15 05:37:32 UTC
getFootnoteText can fail with different exceptions, caused by incorrect range calculations for an empty footnote block:

java.lang.ArrayIndexOutOfBoundsException: 60
	at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:46)
	at org.apache.poi.hwpf.sprm.SprmOperation.<init>(SprmOperation.java:54)
	at org.apache.poi.hwpf.sprm.SprmIterator.next(SprmIterator.java:45)
	at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:58)
	at org.apache.poi.hwpf.model.PAPX.getParagraphProperties(PAPX.java:130)
	at org.apache.poi.hwpf.usermodel.Range.getParagraph(Range.java:819)
	at org.apache.poi.hwpf.extractor.WordExtractor.getParagraphText(WordExtractor.java:140)
	at org.apache.poi.hwpf.extractor.WordExtractor.getFootnoteText(WordExtractor.java:121)

or

java.lang.IllegalArgumentException: The end (1062) must not be before the start (2029)
	at org.apache.poi.hwpf.usermodel.Range.sanityCheckStartEnd(Range.java:244)
	at org.apache.poi.hwpf.usermodel.Range.<init>(Range.java:178)
	at org.apache.poi.hwpf.usermodel.Paragraph.<init>(Paragraph.java:98)
	at org.apache.poi.hwpf.usermodel.Range.getParagraph(Range.java:827)
	at org.apache.poi.hwpf.extractor.WordExtractor.getParagraphText(WordExtractor.java:139)
	at org.apache.poi.hwpf.extractor.WordExtractor.getFootnoteText(WordExtractor.java:120)
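
For readers unfamiliar with the second failure: the message comes from a start/end validation performed when a range object is constructed. The following is a minimal, self-contained sketch of that kind of check. It is not POI source code; the class name OffsetRange and its isEmpty() helper are hypothetical, and only the exception message is taken from the trace above. It shows that an empty range (start == end) passes the check, while inverted offsets such as those in the trace are rejected.

```java
// Hypothetical, simplified stand-in for the start/end validation seen in the
// second stack trace. This is NOT the POI implementation.
final class OffsetRange {
    final int start;
    final int end;

    OffsetRange(int start, int end) {
        // Reject inverted offsets; an empty range (start == end) is allowed.
        if (end < start) {
            throw new IllegalArgumentException(
                "The end (" + end + ") must not be before the start (" + start + ")");
        }
        this.start = start;
        this.end = end;
    }

    // True when the range covers no characters at all.
    boolean isEmpty() {
        return start == end;
    }

    public static void main(String[] args) {
        // An empty range is valid on its own.
        System.out.println(new OffsetRange(1062, 1062).isEmpty());

        // Inverted offsets, as in the reported trace, fail the check.
        try {
            new OffsetRange(2029, 1062);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the check itself behaves reasonably; the inverted offsets have to be produced earlier, by whatever computes the paragraph boundaries for the empty block.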
Comment 1 Maxim Valyanskiy 2009-07-15 05:41:29 UTC
The patch does two things:

1) Adds footnote/endnote extraction to WordExtractor.getText(). This breaks TestWordExtractor, since the POI test suite already contains a file with this problem.

2) Fixes a bug in Range.findRange (this also fixes the TestWordExtractor test case).
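
The attached patch is not reproduced in this thread, so the following is only a sketch of the general kind of guard a range lookup needs for an empty block. It does not show the actual change to Range.findRange. Everything in it is an assumption made for illustration: the class FindRangeSketch, the Node type, and the half-open [start, end) convention are hypothetical. The lookup returns the index interval of the nodes that overlap the requested offsets, and returns an empty interval when the requested block is empty instead of producing inverted or out-of-bounds indices.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a lookup over sorted, non-overlapping property nodes.
// This is NOT the POI Range.findRange implementation.
final class FindRangeSketch {

    // A node covering the half-open character interval [start, end).
    static final class Node {
        final int start;
        final int end;

        Node(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Returns {first, lastExclusive}: the indices of the nodes overlapping
    // [start, end). For an empty request the result is {0, 0}.
    static int[] findRange(List<Node> nodes, int start, int end) {
        if (start >= end) {
            // Empty block: nothing to look up, so report an empty interval.
            return new int[] {0, 0};
        }
        int first = 0;
        // Skip nodes that end at or before the requested start.
        while (first < nodes.size() && nodes.get(first).end <= start) {
            first++;
        }
        int last = first;
        // Include nodes that begin before the requested end.
        while (last < nodes.size() && nodes.get(last).start < end) {
            last++;
        }
        return new int[] {first, last};
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node(0, 10), new Node(10, 20), new Node(20, 30));
        System.out.println(Arrays.toString(findRange(nodes, 5, 15)));
        System.out.println(Arrays.toString(findRange(nodes, 12, 12)));
    }
}
```

With a guard like this, a caller iterating from first to lastExclusive simply does no work for an empty footnote block.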
Comment 2 Maxim Valyanskiy 2009-07-15 05:43:50 UTC
Created attachment 23987
patch
Comment 3 Yegor Kozlov 2009-07-18 03:03:56 UTC
Patch applied in r795333

Thanks,
Yegor