Created attachment 22379 [details] Contains JUnit test class and documents used for testing.
Text in footnotes (the notes inserted at the end of a page) of a Word 2007 document is not extracted. The attachment contains the JUnit test class and the documents used for testing. We expected the words "testdoc" and "test phrase" to be extracted.
Notes on the attached documents:
- the documents "classic_FootNote.docx" and "form_FootNotes.docx" contain the words "testdoc" and "test phrase" in notes inserted at the end of a page.
- "TestUnitPoi35Filter.java" is the JUnit class.
I created a patch that adds text extraction for docx footnotes. Please review my solution; I'm going to add endnote extraction in the same way.
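For readers unfamiliar with where this text lives: in a .docx package, footnote and endnote text is stored in separate parts (word/footnotes.xml and word/endnotes.xml) rather than in word/document.xml, which is why an extractor that reads only the main document part misses it. The following is a minimal sketch using only the JDK, not the code from the attached patch and not the POI API; the class name NoteTextSketch and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of POI.
class NoteTextSketch {
    // WordprocessingML main namespace (the "w:" prefix in the XML parts).
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of all footnote and endnote paragraphs, one per line.
    static String extractNoteText(InputStream docx) throws Exception {
        StringBuilder out = new StringBuilder();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.equals("word/footnotes.xml")
                        || name.equals("word/endnotes.xml")) {
                    // Copy the entry first so the XML parser cannot close the zip stream.
                    appendParagraphs(readAll(zip), out);
                }
            }
        }
        return out.toString();
    }

    // Reads the remainder of the current zip entry into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    // Appends the text of each non-empty w:p paragraph, followed by a newline.
    static void appendParagraphs(byte[] xml, StringBuilder out) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml));
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element p = (Element) paragraphs.item(i);
            // Text runs are stored in w:t elements inside the paragraph.
            NodeList texts = p.getElementsByTagNameNS(W_NS, "t");
            StringBuilder line = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                line.append(texts.item(j).getTextContent());
            }
            if (line.length() > 0) {
                out.append(line).append('\n');
            }
        }
    }
}
```

Passing a FileInputStream for one of the attached test documents to extractNoteText should, under these assumptions, return the note text that the extractor was missing. The separator footnotes that Word writes contain no w:t runs, so the sketch skips them as empty paragraphs.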
Created attachment 23975 [details] src/scratchpad/testcases/org/apache/poi/hwpf/data/snoska.docx
Created attachment 23976 [details] patch
Created attachment 24000 [details] Additional patch that adds text extraction of footnotes in tables
Created attachment 24001 [details] src/scratchpad/testcases/org/apache/poi/hwpf/data/Table.docx
Maxim, XWPFFootnote.java is missing from the patch. Please attach it; I'm going to look into it this weekend. Regards, Yegor
Created attachment 24003 [details] XWPFFootnote.java oops :-)
Created attachment 24004 [details] Additional patch for endnotes
Created attachment 24005 [details] src/scratchpad/testcases/org/apache/poi/hwpf/data/A Nepalese name for Tilaka.docx
Patch applied to svn trunk with some minor tweaks. Thanks, Yegor