Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
Impala 1.1
-
None
-
None
Description
The query SELECT COUNT(*) FROM 300gbTable fails on the 17-node cluster when run with 500 concurrent clients (~30 clients per node). The failure appears to be related to the SEQ/SNAP (Snappy-compressed SequenceFile) file format.
Attached is a heap growth profile. It shows HdfsSequenceScanner::ReadCompressedBlock using ~44GB of memory, of which ~32GB is attributed to SnappyBlockDecompressor::ProcessBlock.
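For reference, the figures quoted above can be put in proportion with a short calculation. This is only a sketch: the inputs are the rounded numbers from this report (17 nodes, 500 clients, ~44GB, ~32GB), and the variable names are illustrative, not taken from Impala.

```python
# Rounded figures quoted in this report; names are illustrative only.
nodes = 17
clients = 500
read_compressed_block_gb = 44   # HdfsSequenceScanner::ReadCompressedBlock
snappy_process_block_gb = 32    # SnappyBlockDecompressor::ProcessBlock

# Average client load per node.
clients_per_node = clients / nodes

# Fraction of the ReadCompressedBlock memory attributed to Snappy decompression.
snappy_share = snappy_process_block_gb / read_compressed_block_gb

# Memory under ReadCompressedBlock not attributed to the Snappy decompressor.
other_gb = read_compressed_block_gb - snappy_process_block_gb

print(f"{clients_per_node:.1f} clients per node")
print(f"{snappy_share:.0%} of ReadCompressedBlock memory is in ProcessBlock")
print(f"~{other_gb}GB is attributed elsewhere under ReadCompressedBlock")
```

Under these rounded inputs, roughly three quarters of the memory attributed to ReadCompressedBlock sits in the Snappy decompressor, which is consistent with the suspicion that the SEQ/SNAP scan path is where the growth occurs.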
14:21:48 Client Thread 148: Result:
14:21:48 -> Avg Time: None, Std Dev: None
14:21:48
14:21:52 Client Thread 178: <class 'tests.beeswax.impala_beeswax.ImpalaBeeswaxException'>:
14:21:52 Query aborted:
14:21:52 Backend 7:Memory limit exceeded
14:21:52 Format error in record or block header at offset: 7937326
14:21:52 First error while processing: hdfs://c2102.hal.cloudera.com:8020/user/impala/test-warehouse/300gb.main_seq_snap/000328_0 at offset: 8388608