Created attachment 22397
Sample .xls file

I'm using the latest release of POI: poi-3.1-FINAL-20080629.jar

I have attached a sample .xls file, source, and exception. Here is the sample code:

import java.io.FileInputStream;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class theApp {
    public static void main(String[] Args) {
        try {
            // Constructing the workbook is enough to trigger the exception
            POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("test-data.xls"));
            HSSFWorkbook wb = new HSSFWorkbook(fs);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

When attempting to open a .xls file I receive the following exception:

org.apache.poi.hssf.record.RecordFormatException: Error reading bytes
    at org.apache.poi.hssf.record.RecordInputStream.nextRecord(RecordInputStream.java:115)
    at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:123)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:246)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:169)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:151)
    at theApp.main(theApp.java:18)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)
Caused by: org.apache.poi.util.LittleEndian$BufferUnderrunException: buffer underrun
    at org.apache.poi.util.LittleEndian.readFromStream(LittleEndian.java:482)
    at org.apache.poi.util.LittleEndian.readShort(LittleEndian.java:414)
    at org.apache.poi.hssf.record.RecordInputStream.nextRecord(RecordInputStream.java:113)
    ... 10 more
I can open this file in Excel without issue or apparent conversion. Once saved from Excel, this issue goes away.
Fixed in svn r683706.

The example file has one extra byte of data beyond the EOFRecord. BTW - what application produced this file?

POI always attempted to read the next record sid without first checking stream.available(). This was wrong, but it seemed to work because another bug in LittleEndian caused readShort() to return 0 when there were zero bytes available. All example spreadsheets up until now have had exactly zero bytes of data after the EOFRecord. RecordInputStream was interpreting nextSid==0 as end of stream. This was also a little bit wrong, since 0x0000 *is* a valid Record sid (from a previous Excel version).

RecordInputStream was changed to check the number of bytes left in the stream before reading the next sid. The 'end of stream' condition is now represented by nextSid==-1 (a safer number). LittleEndian was modified to properly throw BufferUnderrunException even for zero bytes read. LittleEndian was also changed to avoid creating temporary byte arrays just to read bytes, shorts, ints and longs.

A junit test case was added using the sample file provided.
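For readers who want to see the idea behind the fix in isolation, here is a minimal sketch of "check the bytes left before reading the next sid, and use -1 as the end-of-stream marker". It is not the code from r683706: the class NextSidSketch and the names readNextSid, readUShortLE, listSids, END_OF_STREAM and SID_SIZE are made up for illustration, and treating a leftover fragment shorter than a sid as end of stream is an assumption about how trailing bytes should be handled. It uses only java.io, with a ByteArrayInputStream standing in for the workbook stream.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; this is NOT POI's RecordInputStream or LittleEndian.
class NextSidSketch {
    // A real sid is an unsigned 16-bit value (0..65535), so -1 can never collide with one.
    static final int END_OF_STREAM = -1;
    // A sid occupies two bytes (hypothetical constant name).
    static final int SID_SIZE = 2;

    // Reads an unsigned little-endian 16-bit value and fails loudly on a short read,
    // including the case where zero bytes are available.
    static int readUShortLE(InputStream in) throws IOException {
        int lo = in.read();
        int hi = in.read();
        if (lo < 0 || hi < 0) {
            throw new EOFException("buffer underrun");
        }
        return (hi << 8) | lo;
    }

    // Checks the remaining byte count BEFORE trying to read the next sid.
    // Assumption: a leftover fragment shorter than a sid is treated as end of stream.
    // Note: available() is exact for ByteArrayInputStream; for other streams it is only an estimate.
    static int readNextSid(InputStream in) throws IOException {
        if (in.available() < SID_SIZE) {
            return END_OF_STREAM;
        }
        return readUShortLE(in);
    }

    // Walks a buffer of [sid][length][body] records and returns the sids in order.
    static List<Integer> listSids(byte[] data) {
        InputStream in = new ByteArrayInputStream(data);
        List<Integer> sids = new ArrayList<>();
        try {
            int sid;
            while ((sid = readNextSid(in)) != END_OF_STREAM) {
                sids.add(sid);
                int length = readUShortLE(in);
                if (in.skip(length) != length) {
                    throw new EOFException("truncated record body");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sids;
    }

    public static void main(String[] args) {
        // An EOF record (sid 0x000A, length 0) followed by one stray byte, as in the attached file.
        byte[] trailing = {0x0A, 0x00, 0x00, 0x00, (byte) 0xFF};
        System.out.println(listSids(trailing)); // prints [10]

        // A record with sid 0x0000 is read as a record, not mistaken for end of stream.
        byte[] zeroSid = {0x00, 0x00, 0x00, 0x00, 0x0A, 0x00, 0x00, 0x00};
        System.out.println(listSids(zeroSid)); // prints [0, 10]
    }
}
```

With the old behaviour described above, the first buffer would have hit the underrun on the stray byte, and the second would have stopped at the 0x0000 sid.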
(In reply to comment #2)
> Fixed in svn r683706.
> The example file has one extra byte of data beyond the EOFRecord. BTW - what
> application produced this file?
> POI always attempted to read the next record sid, without first checking for
> stream.available(). This was wrong, seemed to work because another bug in
> LittleEndian caused readShort() to return 0 when there were zero bytes
> available. All example spreadsheets up until now have had exactly zero bytes
> data after the EOFRecord. RecordInputStream was interpreting nextSid==0 as end
> of stream. This was also a little bit wrong, since 0x0000 *is* a valid Record
> sid (from a previous Excel version).
> RecordInputStream was changed to check the number of bytes left in the stream
> before reading the next sid. 'End of stream' condition is now represented by
> nextSid==-1 (a safer number). LittleEndian was modified to properly throw
> BufferUnderrunException even for zero bytes read. LittleEndian was also
> changed to avoid creating temporary byte arrays just to read bytes, shorts,
> ints and longs.
> A junit test case was added using the sample file provided.

This file was created by Business Objects XI Update 2.

Can you tell me (roughly) when this resolution will be available in a FINAL build?
(In reply to comment #3)
> Can you tell me (roughly) when this resolution will be available in a FINAL
> build?

I'm not sure of the exact timing for the next release, but it might be in about a month.