Affects Version/s: None
Fix Version/s: None
I have an example PPT that embeds a chart, but Tika misidentifies the
embedded chart as an XLS document.
The progID (oleShape.getProgID() in
HSLFExtractor.handleSlideEmbeddedResources) is MSGraph.Chart.8 ... and
we seem to detect it as Excel (application/vnd.ms-excel) but then the
ExcelExtractor hits this exception:
Since DelegatingParser silently suppresses all exceptions, running
TikaCLI on the PPT shows neither an exception nor any extracted text.
If you run with -z, though, it saves the embedded document as 1.xls,
and parsing that file with TikaCLI hits the above exception.
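
One possible direction, sketched below in plain Java, is to consult the
progID before falling back to content-based detection, so that
MSGraph.Chart.8 is never routed to the Excel parser. This is only an
illustration, not Tika's actual code: the class ProgIdMediaTypes, the
method mediaTypeFor, the prefix table, and the choice of
application/vnd.ms-graph as the chart's media type are all assumptions
made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Tika): maps an OLE progID to a media type.
class ProgIdMediaTypes {
    // Assumed prefix table; the media type chosen for MSGraph is an assumption.
    private static final Map<String, String> PREFIX_TO_TYPE = new LinkedHashMap<>();
    static {
        PREFIX_TO_TYPE.put("Excel.Sheet", "application/vnd.ms-excel");
        PREFIX_TO_TYPE.put("MSGraph.Chart", "application/vnd.ms-graph");
    }

    // Returned when the progID is missing or not in the table.
    static final String FALLBACK = "application/octet-stream";

    // Returns the media type of the first prefix that matches the progID.
    static String mediaTypeFor(String progId) {
        if (progId == null) {
            return FALLBACK;
        }
        for (Map.Entry<String, String> entry : PREFIX_TO_TYPE.entrySet()) {
            if (progId.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return FALLBACK;
    }

    public static void main(String[] args) {
        // The progID reported for the embedded chart in this issue.
        System.out.println(mediaTypeFor("MSGraph.Chart.8"));
        // A genuine embedded Excel worksheet.
        System.out.println(mediaTypeFor("Excel.Sheet.8"));
    }
}
```

With a lookup like this, the embedded chart would get its own type
instead of application/vnd.ms-excel, and the ExcelExtractor would not
be invoked on it.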