Apache Drill / DRILL-7641

Convert Excel Reader to Use Streaming Reader


Details

    Description

      The current Excel reader is built on the standard Apache POI reader, which uses excessive amounts of memory. As a result, attempting to read large Excel files causes out-of-memory errors.

      This PR converts the format plugin to use a streaming reader that is still based on the POI library. The documentation for the streaming reader is available at [1].
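
      To illustrate why streaming helps, here is a minimal sketch of the general idea using only the JDK's StAX pull parser. It is not the excel-streaming-reader API and not the code in this PR; the class name StreamingSheetSketch and the method countRows are hypothetical, and the input is assumed to be the XML of a single worksheet. The parser walks the sheet one event at a time, so the sheet is never materialized as a full object tree.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only: not part of Drill or excel-streaming-reader.
class StreamingSheetSketch {

    // Counts <row> elements in worksheet XML with a pull parser.
    // Rows are visited one at a time instead of being loaded all at once.
    static int countRows(InputStream sheetXml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(sheetXml);
            int rows = 0;
            try {
                while (reader.hasNext()) {
                    // next() advances to the following parse event.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "row".equals(reader.getLocalName())) {
                        rows++;
                    }
                }
            } finally {
                reader.close();
            }
            return rows;
        } catch (XMLStreamException e) {
            // Wrap the checked exception so callers need no throws clause.
            throw new IllegalStateException("Malformed sheet XML", e);
        }
    }
}
```

      Calling countRows on a stream of worksheet XML returns the number of row elements it contains. A real reader would decode cell values at each row event rather than just counting them.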

      All unit tests pass, and I also tested the plugin against some large Excel files on my computer.

      [1]: https://github.com/pjfanning/excel-streaming-reader

       

            People

              cgivre Charles Givre
              cgivre Charles Givre
              Arina Ielchiieva Arina Ielchiieva