Details
- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
By default, Tika extracts Excel values as they are formatted in the sheet. That is a fine default.
However, I am often asked to extract the raw values, because the formatting applied for human readers loses precision.
In local instances I have cloned the Tika classes to do this, but it is messy because of how the code is layered (I wind up extending or copying 3-4 classes because of the chain of class construction).
I believe that adding a config option to the open office config class would let me implement the same behavior much more cleanly.
I plan to issue a pull request in a few weeks (I am doing this contribution on the side, based on professional use).
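A minimal sketch of the behavior being requested, assuming a boolean switch on a config object. The names `ExcelValueConfig`, `setExtractRawValues`, and `CellValueSketch` are placeholders for illustration only and are not existing Tika API; the display pattern stands in for whatever number format the sheet applies to the cell.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical config holder (NOT part of Tika): one flag that switches
// between formatted and raw extraction.
final class ExcelValueConfig {
    // Default is false so that existing behavior (formatted values) is unchanged.
    private boolean extractRawValues = false;

    boolean isExtractRawValues() {
        return extractRawValues;
    }

    void setExtractRawValues(boolean extractRawValues) {
        this.extractRawValues = extractRawValues;
    }
}

// Hypothetical stand-in for the code that turns a numeric cell into text.
final class CellValueSketch {
    // Returns the cell text either as displayed in the sheet (pattern applied)
    // or as the raw stored number, depending on the config flag.
    static String render(double rawValue, String displayPattern, ExcelValueConfig config) {
        if (config.isExtractRawValues()) {
            return Double.toString(rawValue);
        }
        DecimalFormat format =
                new DecimalFormat(displayPattern, DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(rawValue);
    }

    public static void main(String[] args) {
        ExcelValueConfig config = new ExcelValueConfig();

        // Formatted output: the two-decimal display pattern drops precision.
        System.out.println(render(3.14159, "0.00", config));

        // Raw output: the stored value is emitted without the display pattern.
        config.setExtractRawValues(true);
        System.out.println(render(3.14159, "0.00", config));
    }
}
```

Keeping the flag off by default would leave current extraction output unchanged for anyone who does not opt in.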