Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Version: Impala 2.9.0
Description
mmokhtar found a memory leak while inserting into Parquet files.
The /memz debug page showed a lot of untracked memory (note that the sum of the RequestPool peak memory comes nowhere near the Process peak memory):
Process: Limit=100.00 GB Total=11.20 GB Peak=100.24 GB
  Free Disk IO Buffers: Total=609.44 MB Peak=1.76 GB
  RequestPool=fe-eval-exprs: Total=0 Peak=4.00 KB
  RequestPool=root.jenkins: Total=0 Peak=31.08 GB
  RequestPool=root.default: Total=0 Peak=2.05 GB
  RequestPool=root.mmokhtar: Total=1.85 GB Peak=2.30 GB
    Query(9341d70e5e64d792:420d626600000000): Limit=80.00 GB Total=1.85 GB Peak=2.04 GB
      Fragment 9341d70e5e64d792:420d626600000001: Total=1.83 GB Peak=2.04 GB
        SORT_NODE (id=1): Total=1.83 GB Peak=1.85 GB
        HDFS_SCAN_NODE (id=0): Total=0 Peak=594.56 MB
        HdfsTableSink: Total=2.94 MB Peak=3.06 MB
        CodeGen: Total=181.00 B Peak=290.00 KB
      Block Manager: Limit=64.00 GB Total=1.85 GB Peak=1.85 GB
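To make the gap concrete, the top-level peaks in the dump above can be added up. The following is a minimal Python sketch, not part of Impala: the regular expression, the helper names (`to_bytes`, `peaks`) and the assumption that units are binary (1 GB = 1024^3 bytes) are all illustrative choices. Only the top-level entries of the dump are copied in; the nested query entries are left out. Note that per-pool peaks need not coincide in time, so their sum is only an upper bound on what the pools could have held at once.

```python
import re

# Top-level entries copied from the memory dump in this report.
# The nested Query/Fragment entries are omitted on purpose.
DUMP = (
    "Process: Limit=100.00 GB Total=11.20 GB Peak=100.24 GB "
    "Free Disk IO Buffers: Total=609.44 MB Peak=1.76 GB "
    "RequestPool=fe-eval-exprs: Total=0 Peak=4.00 KB "
    "RequestPool=root.jenkins: Total=0 Peak=31.08 GB "
    "RequestPool=root.default: Total=0 Peak=2.05 GB "
    "RequestPool=root.mmokhtar: Total=1.85 GB Peak=2.30 GB"
)

# Assumption: the dump uses binary units.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}
GB = UNITS["GB"]


def to_bytes(value, unit):
    """Convert a (number, unit) pair such as ('2.30', 'GB') to bytes."""
    return float(value) * UNITS[unit]


def peaks(label_pattern, text):
    """Return the Peak value in bytes of every tracker whose label matches."""
    regex = re.compile(
        label_pattern
        + r":(?: Limit=[\d.]+ [KMG]?B)?"   # optional Limit field
        + r" Total=[\d.]+(?: [KMG]?B)?"    # Total may be a bare 0
        + r" Peak=([\d.]+) ([KMG]?B)"
    )
    return [to_bytes(v, u) for v, u in regex.findall(text)]


process_peak = peaks(r"Process", DUMP)[0]
pool_peak = sum(peaks(r"RequestPool=[^:\s]+", DUMP))
unaccounted = process_peak - pool_peak

print(f"Sum of RequestPool peaks: {pool_peak / GB:.2f} GB")
print(f"Process peak:             {process_peak / GB:.2f} GB")
print(f"Not covered by any pool:  {unaccounted / GB:.2f} GB")
```

Even with the 1.76 GB peak of the Free Disk IO Buffers added on top, the tracked peaks stay well below the process peak, which is consistent with a large amount of memory being allocated outside the memory trackers.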
I was able to get a heap growth profile from the live impalad (see https://cwiki.apache.org/confluence/display/IMPALA/Collecting+Impala+CPU+and+Heap+Profiles). I've attached the output of --pdf, which shows that DictEncoders are responsible for a lot of the heap growth.
This looks like the same bug as IMPALA-2940 except on the write path.
Attachments
Issue Links
- is duplicated by
  - IMPALA-6597 Inert into partitioned runs out of memory (Resolved)
- is related to
  - IMPALA-6597 Inert into partitioned runs out of memory (Resolved)