Bug 53984 - RuntimeException: Unexpected record type (org.apache.poi.hssf.record.ColumnInfoRecord)
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 3.8-FINAL
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks: 52447 57669
Reported: 2012-10-09 10:18 UTC by Triqui
Modified: 2015-05-01 20:49 UTC

Attachments
Test file throwing the reported exception when opened (37.43 KB, application/vnd.ms-excel)
2013-01-22 11:43 UTC, Triqui

Description Triqui 2012-10-09 10:18:16 UTC
I keep getting this exception in an application which parses Excel files sent to us.

It happens because the file has one or more ColumnInfoRecord records right after a row block, like this:
Offset=0x00008772(34674) recno=1873 sid=0x00FD size=0x000A(10)
[LABELSST]
    .row    = 0x00D2
    .col    = 0x0006
    .xfindex= 0x002A
  .sstIndex = 0x001D
[/LABELSST]

Offset=0x00008780(34688) recno=1874 sid=0x0201 size=0x0006(6)
[BLANK]
    row= 0x00D2
    col= 0x0007
    xf = 0x002A
[/BLANK]

Offset=0x0000878A(34698) recno=1875 sid=0x0201 size=0x0006(6)
[BLANK]

... several empty cells in row 0xD3

[/BLANK]


Offset=0x000087D0(34768) recno=1882 sid=0x007D size=0x000B(11)
[COLINFO]
  colfirst = 0
  collast  = 0
  colwidth = 10240
  xfindex  = 0
  options  = 0x0000
    hidden   = false
    olevel   = 0
    collapsed= false
[/COLINFO]
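
To check other dumps for the same pattern, the header lines of the BiffViewer output can be scanned for a COLINFO record (sid 0x007D) that directly follows a cell record. The following is only a minimal sketch: the class and method names are hypothetical, the header format is assumed to match the lines quoted above, and only the two cell record types visible in this dump (LABELSST 0x00FD and BLANK 0x0201) are treated as cell records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of POI: scans BiffViewer header lines.
class BiffDumpScan {
    // Matches header lines such as:
    // Offset=0x00008772(34674) recno=1873 sid=0x00FD size=0x000A(10)
    private static final Pattern HEADER =
            Pattern.compile("recno=(\\d+)\\s+sid=0x([0-9A-Fa-f]{4})");

    static final int SID_COLINFO = 0x007D;
    static final int SID_LABELSST = 0x00FD;
    static final int SID_BLANK = 0x0201;

    // Assumption: only the cell record types seen in this dump are listed.
    static boolean isCellRecord(int sid) {
        return sid == SID_LABELSST || sid == SID_BLANK;
    }

    // Returns the recno of every COLINFO record directly preceded by a cell record.
    static List<Integer> findColInfoAfterCells(List<String> dumpLines) {
        List<Integer> hits = new ArrayList<>();
        int previousSid = -1;
        for (String line : dumpLines) {
            Matcher m = HEADER.matcher(line);
            if (!m.find()) {
                continue; // record body line, not a header
            }
            int recno = Integer.parseInt(m.group(1));
            int sid = Integer.parseInt(m.group(2), 16);
            if (sid == SID_COLINFO && isCellRecord(previousSid)) {
                hits.add(recno);
            }
            previousSid = sid;
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> dump = Arrays.asList(
                "Offset=0x00008772(34674) recno=1873 sid=0x00FD size=0x000A(10)",
                "[LABELSST]",
                "Offset=0x00008780(34688) recno=1874 sid=0x0201 size=0x0006(6)",
                "[BLANK]",
                "Offset=0x000087D0(34768) recno=1882 sid=0x007D size=0x000B(11)",
                "[COLINFO]");
        System.out.println(findColInfoAfterCells(dump)); // prints [1882]
    }
}
```

A real scan would need the full list of cell record sids and would read the dump from a file.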


Several versions of Excel and LibreOffice 3.5.4.2 open these files without problems.
I don't know how the files were generated, but I wonder whether the record order described in the OOO excelfileformat.pdf is strict, or whether a ColumnInfoRecord is allowed to appear after a row block.
I would also like to know whether the solution is as simple as adding ColumnInfoRecord to RecordOrderer.isEndOfRowBlock, which I guess was the fix for a very similar bug (bug 50426).

Thank you very much.
Comment 1 Yegor Kozlov 2012-10-09 11:41:35 UTC
Can you attach the file?

Yegor
Comment 2 Triqui 2012-10-09 14:34:05 UTC
Well, in fact, I can't right now, since I cannot share the contents of the files, and modifying and saving the files fixes the issues.

But...
I attached the relevant part of the BiffViewer output for one of the files.
And after debugging the execution I can explain exactly what happens.

A COLINFO record right after the row block means that the ColumnInfoRecords are included in the row block. When they are processed, the RowRecordsAggregate(RecordStream rs, SharedValueManager svm) constructor throws an exception while checking the record type:

	if (!(rec instanceof CellValueRecordInterface)) { // TRUE for ColumnInfoRecord
		throw new RuntimeException("Unexpected record type (" + rec.getClass().getName() + ")");
	}

Oops, sorry, I just realized I didn't attach the exception stack trace. I was going to paste it after the bug title but forgot. Here it is:
java.lang.RuntimeException: Unexpected record type (org.apache.poi.hssf.record.ColumnInfoRecord)
	at org.apache.poi.hssf.record.aggregates.RowRecordsAggregate.<init>(RowRecordsAggregate.java:107) ~[poi-3.8.jar:3.8]
	at org.apache.poi.hssf.model.InternalSheet.<init>(InternalSheet.java:208) ~[poi-3.8.jar:3.8]
	at org.apache.poi.hssf.model.InternalSheet.createSheet(InternalSheet.java:163) ~[poi-3.8.jar:3.8]
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:296) ~[poi-3.8.jar:3.8]
	at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:49) ~[poi-ooxml-3.8.jar:3.8]


I have fixed this by modifying the RecordOrderer.isEndOfRowBlock method, but I don't know if that's the right thing to do:
	public static boolean isEndOfRowBlock(int sid) {
		switch(sid) {
			case ViewDefinitionRecord.sid:
				// should have been prefixed with DrawingRecord (0x00EC), but bug 46280 seems to allow this
			case DrawingRecord.sid:
			case DrawingSelectionRecord.sid:
			case ObjRecord.sid:
			case TextObjectRecord.sid:

			case GutsRecord.sid:   // see Bug 50426
			case ColumnInfoRecord.sid:   // see Bug 53984
			case WindowOneRecord.sid:
				// should really be part of workbook stream, but some apps seem to put this before WINDOW2
			case WindowTwoRecord.sid:
				return true;

			case DVALRecord.sid:
				return true;
			case EOFRecord.sid:
				// WINDOW2 should always be present, so shouldn't have got this far
				throw new RuntimeException("Found EOFRecord before WindowTwoRecord was encountered");
		}
		return PageSettingsBlock.isComponentRecord(sid);
	}
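
The effect of that one extra case label can be illustrated without POI. The sketch below is only a simplified stand-in under stated assumptions: `RowBlockScanDemo` and `readRowBlock` are hypothetical names, the boundary check is reduced to a single WINDOW2 test (sid 0x023E) instead of the full switch above, and the record stream is a bare list of sids modelled on the dump in the description.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

// Hypothetical stand-in for the row block handling, not POI code.
class RowBlockScanDemo {
    static final int LABELSST = 0x00FD;
    static final int BLANK = 0x0201;
    static final int COLINFO = 0x007D;
    static final int WINDOW2 = 0x023E;

    // Assumption: only the cell record types from the dump count as cell values.
    static final Set<Integer> CELL_SIDS = new HashSet<>(Arrays.asList(LABELSST, BLANK));

    // Consumes sids until the boundary predicate fires and returns the number of
    // cell records read. Any non-cell record inside the block is rejected, which
    // mirrors the instanceof check in the RowRecordsAggregate constructor.
    static int readRowBlock(List<Integer> sids, IntPredicate isEndOfRowBlock) {
        int cells = 0;
        for (int sid : sids) {
            if (isEndOfRowBlock.test(sid)) {
                break;
            }
            if (!CELL_SIDS.contains(sid)) {
                throw new RuntimeException(
                        String.format("Unexpected record type (sid=0x%04X)", sid));
            }
            cells++;
        }
        return cells;
    }

    public static void main(String[] args) {
        List<Integer> stream = Arrays.asList(LABELSST, BLANK, BLANK, COLINFO, WINDOW2);

        // Without the patch COLINFO is not a boundary, so it lands in the row block.
        IntPredicate original = sid -> sid == WINDOW2;
        // With the patch COLINFO ends the row block before the type check runs.
        IntPredicate patched = sid -> sid == WINDOW2 || sid == COLINFO;

        try {
            readRowBlock(stream, original);
        } catch (RuntimeException e) {
            System.out.println("original: " + e.getMessage());
        }
        System.out.println("patched: " + readRowBlock(stream, patched) + " cell records");
    }
}
```

The sketch only shows why the exception disappears; it says nothing about what the rest of the sheet parser then does with a COLINFO record found at that position.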

Please let me know whether this info is enough.
In the meantime I will try to get one of those files without any sensitive information so I can attach it here.

Thanks.
Comment 3 Yegor Kozlov 2012-10-09 15:27:16 UTC
The fix looks sane, but I'd rather not commit it without a unit test. A BiffViewer dump is not enough; we need a file.

Yegor
Comment 4 Yegor Kozlov 2012-10-26 11:54:30 UTC
Changing status to NEEDINFO until a test file is provided.
Comment 5 Triqui 2013-01-22 11:43:51 UTC
Created attachment 29880 [details]
Test file throwing the reported exception when opened
Comment 6 Triqui 2013-01-22 11:44:57 UTC
After all this time, at last I have a file that I can share with you.

The problem is that even though you said the proposed fix looks sane, it's not perfect: I found another file that had been opening fine and stopped working once I added the suggested workaround. So we have to find another solution for this. Let me know if I can be of any help with it.
Comment 7 Nick Burch 2014-07-31 13:32:19 UTC
Fixed in r1614884.

I'm fairly sure that the file in question wasn't generated by Excel, as it does some very odd things. We do now handle the ColumnInfo coming at the end rather than the start of the sheet, and we also warn about and skip over sheets where the BOFRecord type isn't one we support (this file has a totally invalid one at the end).
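
One general way to tolerate a late ColumnInfo is to collect COLINFO records wherever they occur in the sheet stream instead of expecting them only before the row block. The sketch below illustrates that idea only; it is not the change made in r1614884, and `LateColumnInfoDemo`, `SplitResult` and `split` are hypothetical names working on bare sids rather than POI records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not POI code and not the committed fix.
class LateColumnInfoDemo {
    static final int COLINFO = 0x007D;

    // The two groups a sheet reader would build from one pass over the stream.
    static final class SplitResult {
        final List<Integer> columnInfos = new ArrayList<>();
        final List<Integer> others = new ArrayList<>();
    }

    // Routes every COLINFO sid to the column group regardless of its position;
    // all other sids keep their relative order in the second group.
    static SplitResult split(List<Integer> sids) {
        SplitResult result = new SplitResult();
        for (int sid : sids) {
            if (sid == COLINFO) {
                result.columnInfos.add(sid);
            } else {
                result.others.add(sid);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // LABELSST, BLANK, then a late COLINFO, then WINDOW2
        SplitResult r = split(Arrays.asList(0x00FD, 0x0201, 0x007D, 0x023E));
        System.out.println(r.columnInfos.size() + " column info record(s), "
                + r.others.size() + " other record(s)");
    }
}
```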