Bug 44410 - Single letter data ranges (such as "C:C") do not 'expand' the same way as Excel expands them.
Summary: Single letter data ranges (such as "C:C") do not 'expand' the same way as Exc...
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.0-dev
Hardware: Other Windows XP
: P3 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-02-13 04:08 UTC by David Webster
Modified: 2008-05-20 03:51 UTC (History)
1 user (show)



Attachments
Excel file showing INDEX and SUM functions failing with single character data range (13.50 KB, application/vnd.ms-excel)
2008-02-13 04:12 UTC, David Webster
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Webster 2008-02-13 04:08:09 UTC
Hello All, 

Description
-----------
If a formula in Excel specifies a data range involving single letters only (i.e.
"C:C" meaning "all of column D"), when parsed in POI, this data range is not
expanded properly.  But, if the data range is specified as
LetterNumber:LetterNumber, all works fine with POI.

Steps to Reproduce
------------------
1/
Create a simple spreadsheet as follows
     A    B    C
   ------------------
1 |    |    | 1  |
2 |    |    | 2  |
3 |    |    | 3  |
4 | ** |    |    |
5 | %% |    |    |

** contains formula =INDEX(C:C,2,1)
%% contains formula =SUM(C:C)

2/
Notice in Excel (2003), cell A4 evaluates to 2 and Cell A5 evaluates to 6.

3/
Parse the Excel file using POI to evaulate cell formulas (i.e. using
HSSFFormulaEvaluator.CellValue cellValue = evaluator.evaluate(cell); for example)

4/
Output for cell A4 shows an exception, output for A5 shows 0.

Actual Results
--------------
As stated in 4/ above.
Exception is:

java.lang.ArrayIndexOutOfBoundsException: 1
        at
org.apache.poi.hssf.record.formula.functions.Index.evaluate(Index.java:83)
        at
org.apache.poi.hssf.record.formula.eval.FuncVarEval.evaluate(FuncVarEval.java:43)
        at
org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.internalEvaluate(HSSFFormulaEvaluator.java:395)
        at
org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator.evaluate(HSSFFormulaEvaluator.java:203)
        at ExcelParser2.getCellValue(ExcelParser2.java:71)
        at ExcelParser2.printWorkSheetTable(ExcelParser2.java:261)
        at ExcelParser2.main(ExcelParser2.java:293)


Expected Results
----------------
Same as Excel as stated in 2/ above.

Build Date & Platform
---------------------
2008-02-13 on Windows XP

Additional Information
----------------------

In HSSFFormulaEvaluator.java, method internalEvaluate(), it seems like when
evaluating "C:C", it is translated into "$C$1:$C$0".

For instance, 
Line 422 (for area and 2d area eval) and Line 439 (3d area eval) both show local
variable "row1" equalling -1 when evaulating, so the subsequent looping through
the rows and columns does not execute, hence the 'values' array is never filled.

Obviously different formulas (INDEX, SUM) handle this unexpected empty values
array differently ... 

The same exception occurs for the INDEX formula with ranges such as
"=INDEX(B:C)" (which works okay in Excel 2003).

At lines HSSFFormulaEvaluator.java:422 and HSSFFormulaEvaluator.java:439 I tried
adding the code: 

// 422:
// fix for evaluating range "D:D" which returns lastRow() of -1 instead of max row
if ( row1 == -1 )
{
  ap.setLastRow((short) (sheet.getLastRowNum()+1));
  row1 = ap.getLastRow();
}

// 439:
// fix for evaluating range "D:D" which returns lastRow() of -1 instead of max row
if ( row1 == -1 )
{
  a3dp.setLastRow((short) (sheet.getLastRowNum()+1));
  row1 = a3dp.getLastRow();
}

but that only made the INDEX() function work and the SUM() one fail with an
Exception - I clearly don't know conceptually where the change should go. As
mentioned above, using a fully specified data range such as "C1:C3" works fine
with both INDEX() and SUM() functions.

Anyways, any help or insights appreciated.

Much thanks as always,

Dave
Comment 1 David Webster 2008-02-13 04:12:22 UTC
Created attachment 21516 [details]
Excel file showing INDEX and SUM functions failing with single character data range
Comment 2 Nick Burch 2008-02-15 05:50:43 UTC
Fixed in SVN. Now not only does the formula evaluator handle these properly,
calling cell.getFormulaString() will also contain the correct formula