Bug 30235 - [PATCH] Fixes for unicode/tables etc for real world documents
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HDF
Version: unspecified
Hardware: Other other
Importance: P3 critical
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-07-21 16:18 UTC by Piers
Modified: 2004-11-16 19:05 UTC

Attachments
ListLevel.java, ListTables.java, SectionTable.java, TextPiece.java, TextPiecetable.java, ParagraphSprmUncompressor.java, TableSprmUncompressor.java, CharacterRun.java, Paragraph.java, Range.java, Table.java, TableRow.java (23.14 KB, text/plain)
2004-07-21 16:21 UTC, Piers

Description Piers 2004-07-21 16:18:17 UTC
A merge of the latest code from Goss Interactive that includes the fixes that 
we rolled into our latest product release. This has been tested against a 
series of testcase word documents that were supplied by clients or created 
specifically in-house. 

Platforms and formats tested include: Office 97/2k/xp, Mac, OpenOffice. All 
testcases can now be read correctly. Writing is not totally transparent and 
still needs some work. From our further testing, previous HWPF write 
functionality appears to remain unaffected.


Files altered:
==============

hwpf/model/ListLevel.java
-------------------------
- fix for listlevel() - papx and chpx arrays were being copied in the wrong 
order

- added getLevelProperties()


hwpf/model/ListTables.java
--------------------------
- added getListData()


hwpf/model/SectionTable.java
----------------------------
- fixed CPtoFC() to accommodate non-contiguous textpieces


hwpf/model/TextPiece.java
-------------------------
- added CP_start property and accessor used by CPtoFC in model/SectionTable.java
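
To illustrate the CPtoFC() change, here is a minimal sketch and not the 
patched code. The Piece class and its fields (cpStart, fcStart, length, 
unicode) are hypothetical stand-ins for TextPiece. It assumes each piece 
records its own starting character position, so the lookup does not rely on 
pieces being stored next to each other in the file.

```java
import java.util.Arrays;
import java.util.List;

public class CpToFcSketch {
    /** Hypothetical stand-in for a text piece; not the HWPF TextPiece class. */
    static final class Piece {
        final int cpStart;      // first character position covered by this piece
        final int fcStart;      // file offset (bytes) where the piece's text begins
        final int length;       // number of characters in the piece
        final boolean unicode;  // true: 2 bytes per character, false: 1 byte

        Piece(int cpStart, int fcStart, int length, boolean unicode) {
            this.cpStart = cpStart;
            this.fcStart = fcStart;
            this.length = length;
            this.unicode = unicode;
        }
    }

    /** Maps a character position to a file offset, or returns -1 if no piece covers it. */
    static int cpToFc(List<Piece> pieces, int cp) {
        for (Piece p : pieces) {
            if (cp >= p.cpStart && cp < p.cpStart + p.length) {
                int bytesPerChar = p.unicode ? 2 : 1;
                return p.fcStart + (cp - p.cpStart) * bytesPerChar;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The second piece sits earlier in the file than the first one.
        List<Piece> pieces = Arrays.asList(
                new Piece(0, 2048, 10, false),
                new Piece(10, 1024, 5, true));
        System.out.println(cpToFc(pieces, 3));   // 2051
        System.out.println(cpToFc(pieces, 12));  // 1028
    }
}
```

Adding the character offset to the first piece's file offset would only be 
correct if every piece followed the previous one on disk, which the testcase 
documents showed is not always true.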


hwpf/model/TextPiecetable.java
------------------------------
- fix for constructor to take into account non-contiguous textpieces


hwpf/sprm/ParagraphSprmUncompressor.java
----------------------------------------
- fix for tabs being read as INTs instead of the SHORTs that they are
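
A minimal sketch of why the width matters, assuming tab positions are stored 
as consecutive little-endian 16-bit values. The class name, method names and 
byte layout below are hypothetical and are not taken from 
ParagraphSprmUncompressor.

```java
public class TabStopSketch {
    /** Reads a signed little-endian 16-bit value at the given offset. */
    static short readShortLE(byte[] data, int offset) {
        return (short) ((data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8));
    }

    /** Reads 'count' consecutive 16-bit tab positions starting at 'offset'. */
    static int[] readTabPositions(byte[] data, int offset, int count) {
        int[] positions = new int[count];
        for (int i = 0; i < count; i++) {
            // Advance 2 bytes per entry; stepping by 4 would skip every other tab.
            positions[i] = readShortLE(data, offset + i * 2);
        }
        return positions;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xD0, 0x02, (byte) 0xA0, 0x05};
        int[] tabs = readTabPositions(data, 0, 2);
        System.out.println(tabs[0] + ", " + tabs[1]);  // 720, 1440
    }
}
```

Reading the same four bytes as one 32-bit value would merge both tab 
positions into a single meaningless number.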


hwpf/sprm/TableSprmUncompressor.java
------------------------------------
- fix for TC entries not always being present in the Word file


hwpf/usermodel/CharacterRun.java
--------------------------------
- changed to use updateSprm() rather than addSprm() to match code elsewhere 
that prevents additional sprms being created if they already exist

- added accessors for properties that Word uses when handling embedded objects 
such as Hyperlinks, Pictures, OleObjects etc
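
The difference between the two calls can be sketched as follows. This is a 
hypothetical stand-in for a sprm list, not the HWPF implementation, and the 
opcode value used in main() is arbitrary: an update overwrites an existing 
entry with the same opcode, while an add always appends.

```java
import java.util.ArrayList;
import java.util.List;

public class SprmListSketch {
    // Each entry is {opcode, value}; a simplified model of a sprm list.
    private final List<int[]> entries = new ArrayList<>();

    /** Always appends, so repeated calls create duplicate entries. */
    void addSprm(int opcode, int value) {
        entries.add(new int[] {opcode, value});
    }

    /** Overwrites the entry with the same opcode if present, otherwise appends. */
    void updateSprm(int opcode, int value) {
        for (int[] entry : entries) {
            if (entry[0] == opcode) {
                entry[1] = value;
                return;
            }
        }
        addSprm(opcode, value);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SprmListSketch sprms = new SprmListSketch();
        sprms.updateSprm(0x1234, 1);  // arbitrary opcode, for illustration only
        sprms.updateSprm(0x1234, 0);
        System.out.println(sprms.size());  // 1
    }
}
```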


hwpf/usermodel/Paragraph.java
-----------------------------
- added accessors for Ilfo and Ilvl used by Word for numbered lists


hwpf/usermodel/Range.java
-------------------------
- fixed text() to correctly return unicode text

- fixed findRange() to prevent loop indexes going out of bounds, causing an 
exception

- fixed getTable() to cope with tables that start at the beginning of a section 
such that they don't get merged with preceding tables

- fixed getTable() to return a table at the correct tablelevel
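
For the text() fix, a minimal sketch of per-piece decoding. It assumes 
16-bit pieces are UTF-16LE and 8-bit pieces are windows-1252; both the 
decodePiece() helper and the choice of 8-bit charset are assumptions made for 
illustration, not the code in Range.java.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PieceTextSketch {
    /** Decodes the raw bytes of one text piece according to its encoding flag. */
    static String decodePiece(byte[] raw, boolean unicode) {
        Charset charset = unicode
                ? StandardCharsets.UTF_16LE          // 2 bytes per character
                : Charset.forName("windows-1252");   // assumed 8-bit encoding
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] wide = {0x48, 0x00, 0x69, 0x00};
        byte[] narrow = {0x48, 0x69};
        System.out.println(decodePiece(wide, true));     // Hi
        System.out.println(decodePiece(narrow, false));  // Hi
    }
}
```

Decoding a 16-bit piece one byte at a time would interleave NUL characters 
with the text, which is the kind of corruption a per-piece encoding flag 
avoids.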


hwpf/usermodel/Table.java
-------------------------
- minor code tidy


hwpf/usermodel/TableRow.java
----------------------------
- removed constraint requiring levelNum==1 in constructor that doesn't work 
when dealing with documents that include sections
Comment 1 Piers 2004-07-21 16:21:52 UTC
Created attachment 12182
ListLevel.java, ListTables.java, SectionTable.java, TextPiece.java, TextPiecetable.java, ParagraphSprmUncompressor.java, TableSprmUncompressor.java, CharacterRun.java, Paragraph.java, Range.java, Table.java, TableRow.java
Comment 2 Glen Stampoultzis 2004-08-04 01:28:51 UTC
Since your contribution is fairly large, I think it would be preferable to get
you to sign a CLA before we commit your changes.  Since this is work on behalf
of your company, it might be best to get them to sign a CLA also.

Individual CLA:

http://www.apache.org/licenses/cla.txt

Company CLA:

http://www.apache.org/licenses/cla-corporate.txt

Sorry to be a pain about CLAs and things but Apache tries hard to make sure
there are no legal issues surrounding the code it receives.
Comment 3 Piers Taylor 2004-08-04 18:59:46 UTC
No problem Glen,

Goss Interactive Ltd. have already signed a CLA which you should have on file. 
Ryan requested this right at the beginning. That CLA will cover the current 
patch since it was work carried out when I was working there.

I have downloaded the "Individual CLA" and will sign and return it to you ASAP.

>Sorry to be a pain...

Not at all.

Regards,
        Piers
Comment 4 Glen Stampoultzis 2004-08-24 12:50:35 UTC
I've applied this patch.  It conflicted with the other patch because they're
cumulative rather than incremental.  Better to close the original patch and
create a new one with all changes included.

I think I got it all sorted out but please give it a check.  Once again... sorry
for being so late in getting this committed.

There are failing tests in scratchpad so if you could have a look at those that
would be great.
Comment 5 Piers Taylor 2004-08-26 03:25:54 UTC
Thanks for applying the patch. I'll check it out asap. I'll take a look at the 
tests too.