Details
Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Description
ParquetRecordReaderWrapper reads the file footer to create the splits, but when realReader.initialize() is then called, Parquet reads the same file footer a second time.
PARQUET-139 did the work to avoid reading the footers in parquet-avro. We should implement the same idea in Hive and update the Parquet library to the latest stable upstream version.
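The double read described above can be illustrated with a minimal, self-contained sketch. It is not the PARQUET-139 change and it uses no Hive or Parquet APIs: every class and method name below (Footer, CountingFooterSource, Split, Reader, initializeByReading, initializeFromSplit) is hypothetical and exists only to show the idea of carrying the already-read footer with the split instead of reading it again during initialization.

```java
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names are Hive or Parquet APIs. */
public class FooterReuseSketch {

    /** Stand-in for parsed footer metadata. */
    static final class Footer {
        final long rowCount;
        Footer(long rowCount) { this.rowCount = rowCount; }
    }

    /** Stand-in for a file whose footer is costly to read; counts each read. */
    static final class CountingFooterSource implements Supplier<Footer> {
        int reads = 0;
        @Override public Footer get() {
            reads++;
            return new Footer(1000L);
        }
    }

    /** Stand-in for a split that keeps the footer it was computed from. */
    static final class Split {
        final Footer footer;
        Split(Footer footer) { this.footer = footer; }
    }

    /** Stand-in for the underlying record reader. */
    static final class Reader {
        Footer footer;

        // Behavior described in the issue: initialization reads the footer itself.
        void initializeByReading(Supplier<Footer> file) { this.footer = file.get(); }

        // Proposed idea: reuse the footer that was read when the split was created.
        void initializeFromSplit(Split split) { this.footer = split.footer; }
    }

    /** Footer is read once for the split and once more during initialization. */
    static int readsWhenRereading() {
        CountingFooterSource file = new CountingFooterSource();
        Split split = new Split(file.get());
        new Reader().initializeByReading(file);
        return file.reads;
    }

    /** Footer is read once for the split and then reused. */
    static int readsWhenReusing() {
        CountingFooterSource file = new CountingFooterSource();
        Split split = new Split(file.get());
        new Reader().initializeFromSplit(split);
        return file.reads;
    }

    public static void main(String[] args) {
        System.out.println("re-reading: " + readsWhenRereading());
        System.out.println("reusing: " + readsWhenReusing());
    }
}
```

In this sketch the first path reads the footer twice and the second reads it once. How the footer would actually be passed from split creation to the reader in Hive depends on the Parquet version adopted, which this sketch does not assume.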