Hello All, Description ----------- If a formula in Excel specifies a data range involving single letters only (i.e. "C:C" meaning "all of column D"), when parsed in POI, this data range is not expanded properly. But, if the data range is specified as LetterNumber:LetterNumber, all works fine with POI. Steps to Reproduce ------------------ 1/ Create a simple spreadsheet as follows A B C ------------------ 1 | | | 1 | 2 | | | 2 | 3 | | | 3 | 4 | ** | | | 5 | %% | | | ** contains formula =INDEX(C:C,2,1) %% contains formula =SUM(C:C) 2/ Notice in Excel (2003), cell A4 evaluates to 2 and Cell A5 evaluates to 6. 3/ Parse the Excel file using POI to evaulate cell formulas (i.e. using HSSFFormulaEvaluator.CellValue cellValue = evaluator.evaluate(cell); for example) 4/ Output for cell A4 shows an exception, output for A5 shows 0. Actual Results -------------- As stated in 4/ above. Exception is: java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.poi.hssf.record.formula.functions.Index.evaluate(Index.java:83) at org.apache.poi.hssf.record.formula.eval.FuncVarEval.evaluate(FuncVarEval.java:43) at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.internalEvaluate(HSSFFormulaEvaluator.java:395) at org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluate(HSSFFormulaEvaluator.java:203) at ExcelParser2.getCellValue(ExcelParser2.java:71) at ExcelParser2.printWorkSheetTable(ExcelParser2.java:261) at ExcelParser2.main(ExcelParser2.java:293) Expected Results ---------------- Same as Excel as stated in 2/ above. Build Date & Platform --------------------- 2008-02-13 on Windows XP Additional Information ---------------------- In HSSFFormulaEvaluator.java, method internalEvaluate(), it seems like when evaluating "C:C", it is translated into "$C$1:$C$0". For instance, Line 422 (for area and 2d area eval) and Line 439 (3d area eval) both show local variable "row1" equalling -1 when evaulating, so the subsequent looping through the rows and columns does not execute, hence the 'values' array is never filled. Obviously different formulas (INDEX, SUM) handle this unexpected empty values array differently ... The same exception occurs for the INDEX formula with ranges such as "=INDEX(B:C)" (which works okay in Excel 2003). At lines HSSFFormulaEvaluator.java:422 and HSSFFormulaEvaluator.java:439 I tried adding the code: // 422: // fix for evaluating range "D:D" which returns lastRow() of -1 instead of max row if ( row1 == -1 ) { ap.setLastRow((short) (sheet.getLastRowNum()+1)); row1 = ap.getLastRow(); } // 439: // fix for evaluating range "D:D" which returns lastRow() of -1 instead of max row if ( row1 == -1 ) { a3dp.setLastRow((short) (sheet.getLastRowNum()+1)); row1 = a3dp.getLastRow(); } but that only made the INDEX() function work and the SUM() one fail with an Exception - I clearly don't know conceptually where the change should go. As mentioned above, using a fully specified data range such as "C1:C3" works fine with both INDEX() and SUM() functions. Anyways, any help or insights appreciated. Much thanks as always, Dave
Created attachment 21516 [details] Excel file showing INDEX and SUM functions failing with single character data range
Fixed in SVN. Now not only does the formula evaluator handle these properly, calling cell.getFormulaString() will also contain the correct formula