IMPALA-1803

Avoid hitting OOM in HdfsTableSink when inserting to Parquet

      Description

      Impala's memory consumption is very high when it writes to Parquet with a large number of partitions, primarily because we buffer data per partition. This can lead to OOM; see the attached profile. Instead, we could either spill the buffered data to disk or write it out to the Parquet files.
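
      To illustrate the idea, here is a minimal sketch of per-partition buffering under a shared memory budget. It is not Impala's HdfsTableSink code. The class `BoundedPartitionBuffer`, the `budget_bytes` parameter, and the `flush` callback are hypothetical names, and `flush` stands in for either option above (spilling to disk or writing to the Parquet file).

```python
from collections import defaultdict


class BoundedPartitionBuffer:
    """Hypothetical sketch: per-partition row buffers sharing one byte budget."""

    def __init__(self, budget_bytes, flush):
        self.budget_bytes = budget_bytes
        # flush(partition_key, rows) is an assumed callback that spills or writes rows.
        self.flush = flush
        self.buffers = defaultdict(list)
        self.sizes = defaultdict(int)
        self.total = 0

    def append(self, partition_key, row):
        """Buffer one row (bytes) for a partition, evicting if over budget."""
        self.buffers[partition_key].append(row)
        self.sizes[partition_key] += len(row)
        self.total += len(row)
        # Flush the largest buffers until total buffered bytes fit the budget.
        while self.total > self.budget_bytes and self.sizes:
            self._flush_partition(max(self.sizes, key=self.sizes.get))

    def _flush_partition(self, key):
        rows = self.buffers.pop(key)
        self.total -= self.sizes.pop(key)
        self.flush(key, rows)

    def close(self):
        """Flush whatever is still buffered, in insertion order."""
        for key in list(self.buffers):
            self._flush_partition(key)
```

      With this shape, memory use is bounded by the budget rather than by the number of partitions, at the cost of more and smaller flushes when many partitions are active at once.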

        Attachments

        1. hdfstablesink-oom.txt
          29 kB
          Ippokratis Pandis

              People

              • Assignee: Unassigned
              • Reporter: Ippokratis Pandis
              • Votes: 2
              • Watchers: 4
