Bug 47309 - Number of Cell Comments in a sheet limited to 65536 with HSSF
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 3.2-FINAL
Hardware: PC Windows XP
Importance: P1 critical
Target Milestone: ---
Assignee: POI Developers List
Reported: 2009-06-03 06:38 UTC by Ken Adams
Modified: 2009-06-07 08:25 UTC



Attachments
JUnit test case (3.79 KB, text/plain)
2009-06-03 06:39 UTC, Ken Adams

Description Ken Adams 2009-06-03 06:38:42 UTC
It seems to me that the number of cell comments that can be added to a spreadsheet using HSSF without messing up the ordering is limited to 65536 comments. 

* Create a sheet with comments using HSSFComment
* Write the sheet to an output stream
* Read the sheet back with an input stream
-> the comments no longer appear in the correct row/column

NOTE: MS Excel can read the comments correctly!


The HSSFCell.findComments() method indexes TextObjectRecords in a hash keyed by their IDs, which are of type Short; this explains the limit. Using Integer instead (both here and when serializing/deserializing) breaks the file format so that not even MS Excel can read it.
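
To illustrate why a 16-bit key caps the number of distinguishable comments, here is a minimal stand-alone sketch. It is not POI code: the class `ShortIdCollisionDemo`, the method `indexByShortId`, and the assumption that IDs are simply assigned as 0, 1, 2, ... are all hypothetical and only model the truncation effect.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a comment lookup keyed by a 16-bit id; not taken from POI. */
class ShortIdCollisionDemo {

    /** Stores one text per comment, keyed by the comment's sequence number truncated to 16 bits. */
    static Map<Short, String> indexByShortId(int commentCount) {
        Map<Short, String> textById = new HashMap<>();
        for (int i = 0; i < commentCount; i++) {
            short id = (short) i;             // the cast keeps only the low 16 bits
            textById.put(id, "comment " + i); // a later comment replaces an earlier one with the same id
        }
        return textById;
    }

    public static void main(String[] args) {
        // The map can never hold more than 65536 distinct keys.
        System.out.println(indexByShortId(70000).size());
        // Comment 65536 wraps around to id 0 and replaces comment 0.
        System.out.println(indexByShortId(70000).get((short) 0));
    }
}
```

In this model, any comment past the 65536th reuses an ID that is already taken, so a lookup by ID returns the wrong text for the earlier cell.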


Attachment:

The attached JUnit test produces a spreadsheet with more than 65536 comments, writes it to a file, reads the file back in, and compares the order of comments between the two sheets.
Each comment contains just the row/col index numbers, for comparison with the cell content.
Comment 1 Ken Adams 2009-06-03 06:39:19 UTC
Created attachment 23748
JUnit test case
Comment 2 Yegor Kozlov 2009-06-07 08:25:50 UTC
Ken,

Thanks for the good bug report, you've done 50 percent of the job to fix the problem. I committed the fix in r782398. 

ShapeId is an unsigned short, so matching cells and comments by it works only if the number of comments is less than 65536. The fact that Excel can handle sheets with more comments made me think it uses different logic. In fact, Excel uses a simple ordinal relationship: the i-th NoteRecord corresponds to the i-th drawing group containing the TextObjectRecord with the comment's text. This heuristic works for any number of cell comments.
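
The ordinal relationship can be sketched as follows. This is a simplified stand-alone model, not the code committed in r782398: the class `OrdinalCommentMatchDemo`, the method `matchByPosition`, and the use of plain strings in place of NoteRecord and TextObjectRecord are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of matching notes to comment texts by position instead of by id. */
class OrdinalCommentMatchDemo {

    /**
     * Pairs the i-th note (represented here by its cell reference) with the i-th text,
     * in the order both were read. No id is consulted, so there is no 16-bit limit.
     */
    static List<String> matchByPosition(List<String> noteCells, List<String> texts) {
        if (noteCells.size() != texts.size()) {
            throw new IllegalArgumentException("expected one text per note");
        }
        List<String> paired = new ArrayList<>();
        for (int i = 0; i < noteCells.size(); i++) {
            paired.add(noteCells.get(i) + " -> " + texts.get(i));
        }
        return paired;
    }

    public static void main(String[] args) {
        List<String> cells = new ArrayList<>();
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < 70000; i++) {
            cells.add("cell " + i);
            texts.add("comment " + i);
        }
        // Entry 65536 keeps its own text; nothing wraps around.
        System.out.println(matchByPosition(cells, texts).get(65536));
    }
}
```

The sketch relies on both lists being in the same order, which is the same assumption the ordinal heuristic makes about the records in the file.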

Regards,
Yegor