Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Impala 2.0
Description
Impala's memory consumption is very high when it writes to a Parquet table with a large number of partitions, primarily because we buffer data per partition. This can lead to out-of-memory (OOM) failures; see the attached profile. Instead of holding everything in memory, we could either spill the buffered data to disk or write it out to the Parquet files.
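To make the spilling idea concrete, here is a minimal Python sketch of a writer that buffers rows per partition under a global memory budget and flushes the largest buffer to disk whenever the budget is exceeded. This is not Impala's implementation: the class `BoundedPartitionBuffer`, the `max_buffered_bytes` parameter, and the `.part` spill files are all hypothetical names chosen for illustration, and plain newline-delimited bytes stand in for Parquet encoding.

```python
import os
from collections import defaultdict


class BoundedPartitionBuffer:
    """Hypothetical sketch: per-partition buffers with a global byte budget."""

    def __init__(self, spill_dir, max_buffered_bytes):
        self.spill_dir = spill_dir
        self.max_buffered_bytes = max_buffered_bytes
        self.buffers = defaultdict(list)  # partition key -> encoded rows
        self.sizes = defaultdict(int)     # partition key -> buffered bytes
        self.total_bytes = 0
        self.spill_count = 0

    def append(self, partition, row):
        data = row.encode("utf-8") + b"\n"
        self.buffers[partition].append(data)
        self.sizes[partition] += len(data)
        self.total_bytes += len(data)
        # Evict until we are back under budget, largest buffer first.
        while self.total_bytes > self.max_buffered_bytes:
            self._spill_largest()

    def _spill_largest(self):
        victim = max(self.sizes, key=self.sizes.get)
        self._flush(victim)
        self.spill_count += 1

    def _flush(self, partition):
        # Append mode keeps rows from earlier spills of the same partition.
        path = os.path.join(self.spill_dir, f"{partition}.part")
        with open(path, "ab") as f:
            f.writelines(self.buffers[partition])
        self.total_bytes -= self.sizes[partition]
        del self.buffers[partition]
        del self.sizes[partition]

    def close(self):
        # Write out whatever is still buffered.
        for partition in list(self.buffers):
            self._flush(partition)
```

With this shape, peak buffered memory is bounded by `max_buffered_bytes` regardless of how many partitions are open, at the cost of extra I/O and smaller writes per partition. Spilling the largest buffer first is just one possible eviction policy.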
Attachments
Issue Links
- duplicates IMPALA-2522 Improve the reliability and effectiveness of ETL (Resolved)