Bug 52575 - [PATCH] Ignore missing workbook references
[PATCH] Ignore missing workbook references
Status: RESOLVED FIXED
Product: POI
Classification: Unclassified
Component: HSSF
3.7
PC All
: P2 normal (vote)
: ---
Assigned To: POI Developers List
: PatchAvailable
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2012-02-01 22:45 UTC by SunGard Global Services Germany
Modified: 2012-02-07 08:24 UTC (History)
0 users



Attachments
ignoreMissingWorkbooks.patch - Patch to implement this feature (4.72 KB, application/octet-stream)
2012-02-01 22:45 UTC, SunGard Global Services Germany
Details
IgnoreWorkbooks with Testcases (10.57 KB, patch)
2012-02-05 22:48 UTC, SunGard Global Services Germany
Details | Diff
ZIP of the XLS (2003) sheets to execute the testcases (7.38 KB, application/octet-stream)
2012-02-05 22:51 UTC, SunGard Global Services Germany
Details

Note You need to log in before you can comment on or make changes to this bug.
Description SunGard Global Services Germany 2012-02-01 22:45:47 UTC
Created attachment 28248 [details]
ignoreMissingWorkbooks.patch - Patch to implement this feature

In our project we have one XLS file referencing > 10 other XLS files.
Setting up a workbook environment works as expected and a formular recalculation works.

In our project, we sometimes do not have all referenced XLS files but want to recalculate those values for which we have the referenced XLS files.
For those situations we are absolutly fine with the cached values which exists already in the main XLS.

This might be a quite dedicated setup: Therefore I introduced a system property to change the behavoir:
The old behavoir is to fail as before.
Setting the property to true issues an INFO log message and continues with the cached value.


Currently I'm a little bit unhappy about the brute-force approach to get the cell type to access the cached value - but I didn't found another way to handle it.


Please let me know your thoughts - Feel free to take this patch as an insipration for a feature request and implement it in a more POI-standard way.

I have not signed any kind of NDA from Microsoft and used only the public availible documentation about the XLS file format.
Comment 1 Yegor Kozlov 2012-02-03 08:46:48 UTC
Thanks for the patch. There two things to do before check it in svn:

1. It lacks a unit test(s). Can you write sample code that demonstrates that the code throws  WorkbookNotFoundException if IGNORE_MISSING_WORKBOOKS=false and uses cached formula values otherwise. If the test requires input workbooks then attached them along with the test.

2. What about loading IGNORE_MISSING_WORKBOOKS from a poi.properties file? First lookup poi.properties in the classpath and then load default config bundled in the jar. 
My concern is that in many environments users cannot change JVM system properties, for example, think of a web application that cannot change JVM settings of the server (Tomcat or whatever). With poi.properties, you can simply put in the classpath (WEB-INF/classes) and POI will initialize from it  -  pretty much like log4j searches its config file. 

Regards,
Yegor
Comment 2 SunGard Global Services Germany 2012-02-03 11:50:48 UTC
Thanks for the feedback Yegor.

Providing a test with workbook is quite simple.

RE 2.: I tried to keep the impact as low as possible.
Is there some poi.properties infrastructure? I didn't noticed something like this.
Comment 3 Yegor Kozlov 2012-02-03 11:54:55 UTC
(In reply to comment #2)
> Thanks for the feedback Yegor.
> 
> Providing a test with workbook is quite simple.
> 
> RE 2.: I tried to keep the impact as low as possible.
> Is there some poi.properties infrastructure? I didn't noticed something like
> this.

No, for now poi.properties exists only in my dreams :)

Yegor
Comment 4 Nick Burch 2012-02-03 12:13:56 UTC
On the settings thing, might it be worth copying what ExtractorFactory does for the usermodel vs event streaming setting?
Comment 5 SunGard Global Services Germany 2012-02-05 22:48:58 UTC
Created attachment 28271 [details]
IgnoreWorkbooks with Testcases

Attached a new patch containing the original one (with some simplifications) and testcases.
The attached XLS are also required for the testcase execution.

I would like to keep track of the "configuration infrastructure" in a different ticket.
I will post a patch within another bug.
Comment 6 SunGard Global Services Germany 2012-02-05 22:51:03 UTC
Created attachment 28272 [details]
ZIP of the XLS (2003) sheets to execute the testcases
Comment 7 Yegor Kozlov 2012-02-06 07:49:42 UTC
Applied in r1240903 with minor tweaks:
 - you forgot to implement getCachedFormulaResultType() in ForkedEvaluationCell
 - I changed the unit test to utilize POI infrastructure to read test files: instead of direct reading from FileInputStream it is recommended to use HSSFTestDataSamples.openSampleWorkbook (filename)
 - added a prefix to the test files and put them in ./test-data/spreadsheet. I would like to keep all spreadsheet samples in the same directory and avoid extra sub-folders. 

Regards,
Yegor

(In reply to comment #4)
> On the settings thing, might it be worth copying what ExtractorFactory does for
> the usermodel vs event streaming setting?

(In reply to comment #6)
> Created attachment 28272 [details]
> ZIP of the XLS (2003) sheets to execute the testcases
Comment 8 Yegor Kozlov 2012-02-06 07:59:09 UTC
I think that the simplest (and the best) solution would be to add a setter for this property right in the formula evaluator. The client code would look as follows:

FormulaEvaluator ev = workbook.getCreationHelper().createFormulaEvaluator();
ev.setIgnoreMissingWorkbooks(true);

This way this setting can be changed in a very flexible way. 

Yegor

(In reply to comment #4)
> On the settings thing, might it be worth copying what ExtractorFactory does for
> the usermodel vs event streaming setting?
Comment 9 Yegor Kozlov 2012-02-07 08:10:28 UTC
I removed configuring IGNORE_MISSING_WORKBOOKS from a system property and added a public setter HSSFFormulaEvaluator.setIgnoreMissingWorkbooks(boolean ignore).

Client code:

HSSFFormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
evaluator.setIgnoreMissingWorkbooks(true); 

For now setIgnoreMissingWorkbooks is in HSSF only because evaluation across multiple workbooks is not yet implemented in XSSF. 

Yegor
Comment 10 SunGard Global Services Germany 2012-02-07 08:24:29 UTC
(In reply to comment #9)
> I removed configuring IGNORE_MISSING_WORKBOOKS from a system property and added
> a public setter HSSFFormulaEvaluator.setIgnoreMissingWorkbooks(boolean ignore).
Ok, this makes my patch submission unnecessary. 
Thanks :)