Description
Tika's existing ExcelParser implementation uses POI's HSSFWorkbook to extract text from an Excel file. POI also provides an alternative "Event API"[1] for processing Excel files - the advantage being that it has a much smaller memory footprint, but at the cost of a slightly more complex API.
I have written an alternative excel parser implementation based on the Event API - if its of interest to the Tika project I'll write a test case for it.