Created attachment 25934 [details]
Fixed version

When you call ParagraphProperties.getLvl() on any style that is part of the outline, it returns a value from 0 to 8 for Heading 1 to Heading 9. For normal styles, however, it returns 0, which makes them indistinguishable from Heading 1.

The MICROSOFT OFFICE WORD 97-2007 BINARY FILE FORMAT SPECIFICATION states the following:

The standard PAP is all zeros except:
  fWidowControl 1
  fMultLineSpace 1
  dyaLine 240 twips
  Lvl 9

I solved this problem by changing the initial value of the property org.apache.poi.hwpf.model.types.PAPAbstractType#field_58_lvl to 9. After this change, I can read the same outline levels that Word shows me: getLvl() returns values from 0 to 9, where 9 is Body text and 0..8 are outline levels 1..9.

Attached is a modified version of PAPAbstractType.java, which also changes the initial value of the property field_17_fWidowControl according to the specification. Properties for fMultLineSpace and dyaLine are not there; they probably belong to some newer version of Word.
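To make the intended meaning of the lvl value concrete, here is a minimal sketch of the mapping described above (0..8 are outline levels 1..9, 9 is Body text). The class OutlineLevels and its methods are hypothetical helpers written for illustration only; they are not part of POI.

```java
/** Hypothetical helper, not a POI class: interprets a PAP lvl value. */
final class OutlineLevels {
    /** Default lvl of the standard PAP according to the specification quote above. */
    static final int BODY_TEXT_LVL = 9;

    /** True when the paragraph is not part of the outline. */
    static boolean isBodyText(int lvl) {
        return lvl == BODY_TEXT_LVL;
    }

    /** Turns a raw lvl value (0..9) into the label Word would show. */
    static String describe(int lvl) {
        if (lvl < 0 || lvl > BODY_TEXT_LVL) {
            throw new IllegalArgumentException("lvl out of range: " + lvl);
        }
        // lvl is zero-based, Word's outline levels are one-based.
        return isBodyText(lvl) ? "Body text" : "Outline level " + (lvl + 1);
    }

    public static void main(String[] args) {
        for (int lvl = 0; lvl <= BODY_TEXT_LVL; lvl++) {
            System.out.println(lvl + " -> " + describe(lvl));
        }
    }
}
```

With a default of 0 instead of 9, describe() would report every normal paragraph as "Outline level 1", which is the ambiguity reported here.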
Any chance you could write a quick unit test that shows it still getting the correct values for the headings after the change, and also getting the correct value for normal text with the change?
I studied the problem further and found several other problems:

- The ParagraphSprmUncompressor class incorrectly sets the value for operation 0x40 to ilvl instead of lvl, contrary to the binary format specification.
- The lvl in operation 0x40 should always be set, not only when istd is between 1 and 9. The binary format specification does state that condition, but in Word you can set the outline level for any paragraph except Heading 1..9, and with this condition in place POI does not see it. With the condition commented out, everything seems OK; see the test case.
- I added a getLvl() method to Paragraph, which makes it possible to read the outline level at the paragraph level.

Attached is the JUnit test case. This is its output with the current version (3.7b2; the style level is shown in place of the paragraph level, as the current version does not support reading the level at the paragraph level):

Style level: 0, paragraph level: 0, text: Heading 1
Style level: 0, paragraph level: 0, text: Heading 2
Style level: 0, paragraph level: 0, text: Heading 3
Style level: 0, paragraph level: 0, text: Heading 4
Style level: 0, paragraph level: 0, text: Heading 5
Style level: 0, paragraph level: 0, text: Heading 6
Style level: 0, paragraph level: 0, text: Heading 7
Style level: 0, paragraph level: 0, text: Heading 8
Style level: 0, paragraph level: 0, text: Heading 9
Style level: 0, paragraph level: 0, text: Body text with unchanged outline level
Style level: 0, paragraph level: 0, text: Body text with outline level changed to Level 1
Style level: 0, paragraph level: 0, text: Body text with outline level changed to Level 5

This is the output after applying the attached patches to version 3.7b2:

Style level: 0, paragraph level: 0, text: Heading 1
Style level: 1, paragraph level: 1, text: Heading 2
Style level: 2, paragraph level: 2, text: Heading 3
Style level: 3, paragraph level: 3, text: Heading 4
Style level: 4, paragraph level: 4, text: Heading 5
Style level: 5, paragraph level: 5, text: Heading 6
Style level: 6, paragraph level: 6, text: Heading 7
Style level: 7, paragraph level: 7, text: Heading 8
Style level: 8, paragraph level: 8, text: Heading 9
Style level: 9, paragraph level: 9, text: Body text with unchanged outline level
Style level: 9, paragraph level: 0, text: Body text with outline level changed to Level 1
Style level: 9, paragraph level: 4, text: Body text with outline level changed to Level 5

Specification used: http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/Word97-2007BinaryFileFormat(doc)Specification.pdf
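The difference between the two ways of handling operation 0x40 can be sketched with a small standalone model. This is only an illustration of the behaviour described in this comment, not the actual ParagraphSprmUncompressor code: the Pap class, both method names, the default lvl of 9 and the use of istd 0 for a non-heading paragraph are assumptions made for the example.

```java
/** Hypothetical model of how operation 0x40 is applied; not POI code. */
final class OutlineSprmSketch {
    /** Simplified stand-in for paragraph properties. */
    static final class Pap {
        final int istd;  // style index of the paragraph
        int ilvl;        // list level (a different property from lvl)
        int lvl = 9;     // outline level; 9 = Body text (assumed default)

        Pap(int istd) {
            this.istd = istd;
        }
    }

    /** Behaviour as described before the patch: guarded by istd, writes to ilvl. */
    static void applyOp0x40Before(Pap pap, int operand) {
        if (pap.istd >= 1 && pap.istd <= 9) {
            pap.ilvl = operand;
        }
    }

    /** Behaviour as described after the patch: always writes to lvl. */
    static void applyOp0x40After(Pap pap, int operand) {
        pap.lvl = operand;
    }

    public static void main(String[] args) {
        // A non-heading paragraph (istd 0 assumed) whose outline level was set to Level 5.
        Pap before = new Pap(0);
        applyOp0x40Before(before, 4);
        Pap after = new Pap(0);
        applyOp0x40After(after, 4);
        System.out.println("before patch: lvl=" + before.lvl + ", ilvl=" + before.ilvl);
        System.out.println("after patch:  lvl=" + after.lvl + ", ilvl=" + after.ilvl);
    }
}
```

In this model the guarded variant leaves lvl untouched for the body text paragraph, so the outline level set in Word is lost, while the unguarded variant reports 4, matching the "changed to Level 5" line in the patched output above.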
Created attachment 25946 [details] Patched files and test case
I found a mistake in my patch. I added the initial values to PAPAbstractType, but they were already set in the ParagraphProperties constructor. However, the initial value for lvl was incorrectly assigned there to ilvl (these are two different properties); that is the problem to be fixed. I will attach a new patch. Please apply the patch to version 3.7 so that we can use the final version without patching.
Created attachment 26000 [details] Patched files and test case version 2
Created attachment 26020 [details]
Diff file to be applied to repository

This is a patch created by following the rules in the POI Contribution Guidelines, along with a test case. It was created against the current repository head. Please review it and, if possible, merge it.
Created attachment 26021 [details] New files not included in the diff
Thanks, patch applied in r998897.