47535 – [PATCH] Exception on WordExtractor.getFootnoteText

Status:	RESOLVED FIXED

Alias:	None

Product:	POI
Classification:	Unclassified
Component:	HWPF
Version:	3.5-dev
Hardware:	PC Linux

Importance:	P2 normal
Target Milestone:	---
Assignee:	POI Developers List

Reported:	2009-07-15 05:37 UTC by Maxim Valyanskiy
Modified:	2009-07-18 03:03 UTC
CC List:	0 users

Attachments
patch (1.59 KB, patch) 2009-07-15 05:43 UTC, Maxim Valyanskiy

Description Maxim Valyanskiy 2009-07-15 05:37:32 UTC

getFootnoteText can fail with either of the exceptions below. Both are caused by an incorrect range calculation for an empty footnote block:

java.lang.ArrayIndexOutOfBoundsException: 60
	at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:46)
	at org.apache.poi.hwpf.sprm.SprmOperation.<init>(SprmOperation.java:54)
	at org.apache.poi.hwpf.sprm.SprmIterator.next(SprmIterator.java:45)
	at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:58)
	at org.apache.poi.hwpf.model.PAPX.getParagraphProperties(PAPX.java:130)
	at org.apache.poi.hwpf.usermodel.Range.getParagraph(Range.java:819)
	at org.apache.poi.hwpf.extractor.WordExtractor.getParagraphText(WordExtractor.java:140)
	at org.apache.poi.hwpf.extractor.WordExtractor.getFootnoteText(WordExtractor.java:121)

or

java.lang.IllegalArgumentException: The end (1062) must not be before the start (2029)
	at org.apache.poi.hwpf.usermodel.Range.sanityCheckStartEnd(Range.java:244)
	at org.apache.poi.hwpf.usermodel.Range.<init>(Range.java:178)
	at org.apache.poi.hwpf.usermodel.Paragraph.<init>(Paragraph.java:98)
	at org.apache.poi.hwpf.usermodel.Range.getParagraph(Range.java:827)
	at org.apache.poi.hwpf.extractor.WordExtractor.getParagraphText(WordExtractor.java:139)
	at org.apache.poi.hwpf.extractor.WordExtractor.getFootnoteText(WordExtractor.java:120)
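
The second trace comes from a start/end sanity check that rejects a range whose end precedes its start. Below is a minimal sketch of that kind of check, together with a caller-side guard that treats an inverted or zero-length block as empty. The class and method names are hypothetical and chosen for illustration only. This is not POI's Range code, nor the change made by the attached patch.

```java
// Hypothetical illustration; not taken from the POI sources.
final class RangeCheckSketch {

    // Rejects a range whose end lies before its start, mirroring the
    // message format seen in the IllegalArgumentException above.
    static void sanityCheckStartEnd(int start, int end) {
        if (start < 0) {
            throw new IllegalArgumentException(
                "Range start must not be negative. Given " + start);
        }
        if (end < start) {
            throw new IllegalArgumentException(
                "The end (" + end + ") must not be before the start (" + start + ")");
        }
    }

    // Assumed guard: a block with no characters (or inverted bounds)
    // is reported as empty instead of being turned into a range.
    static boolean isEmptyBlock(int start, int end) {
        return end <= start;
    }

    // Length of the block in characters, never negative.
    static int blockLength(int start, int end) {
        return Math.max(0, end - start);
    }

    public static void main(String[] args) {
        int start = 2029;
        int end = 1062; // the offsets from the reported exception
        if (isEmptyBlock(start, end)) {
            System.out.println("empty block, nothing to extract");
        } else {
            sanityCheckStartEnd(start, end);
            System.out.println("block length: " + blockLength(start, end));
        }
    }
}
```

With the offsets from the trace, the guard short-circuits before the sanity check is reached, so no exception is thrown for the empty block.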

Comment 1 Maxim Valyanskiy 2009-07-15 05:41:29 UTC

The patch does two things:

1) Adds footnote/endnote extraction to WordExtractor.getText(). On its own this breaks TestWordExtractor, since the POI test suite already contains a file with this problem.

2) Fixes a bug in Range.findRange (this also fixes the TestWordExtractor test case).

Comment 2 Maxim Valyanskiy 2009-07-15 05:43:50 UTC

Created attachment 23987
patch

Comment 3 Yegor Kozlov 2009-07-18 03:03:56 UTC

Patch applied in r795333

Thanks,
Yegor