Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- Impala 2.3.0
Description
Parquet dictionary decoders can accumulate throughout query execution. One is created per column per split. Each decoder contains a vector of dictionary values that is not cleared when the scanner is finished with it.
I've attached a graph of memory usage when running this query on TPC-DS scale factor 100. "Before" is cdh5-trunk, and "after" is with a fix that deletes the ColumnReader objects after each input split.
use tpcds_100_parquet; set num_scanner_threads=8; select * from store_sales where ss_sold_time_sk = 12512434;
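To illustrate the accumulation described above, here is a minimal, self-contained sketch. It is not Impala's code: `DictDecoder`, `ColumnReader`, `ReaderPool`, and both `PeakBytes...` functions are hypothetical stand-ins, and the dictionary sizes are made up. It only contrasts readers kept in a query-lifetime container with readers released after each split.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a dictionary decoder: owns the decoded values.
struct DictDecoder {
  std::vector<std::int64_t> dict;
  explicit DictDecoder(std::size_t entries) : dict(entries, 0) {}
};

// Hypothetical stand-in for a per-column, per-split reader owning one decoder.
struct ColumnReader {
  DictDecoder decoder;
  explicit ColumnReader(std::size_t entries) : decoder(entries) {}
  std::size_t DictBytes() const {
    return decoder.dict.size() * sizeof(std::int64_t);
  }
};

using ReaderPool = std::vector<std::unique_ptr<ColumnReader>>;

// Total dictionary bytes held by all readers currently alive in the pool.
std::size_t PoolBytes(const ReaderPool& pool) {
  std::size_t total = 0;
  for (const auto& reader : pool) total += reader->DictBytes();
  return total;
}

// Leaky pattern: readers live in a pool that outlasts every split, so
// dictionary memory grows with the number of splits processed.
std::size_t PeakBytesQueryPool(int splits, int cols, std::size_t entries) {
  assert(splits > 0 && cols > 0);
  ReaderPool query_pool;  // lives for the whole "query"
  std::size_t peak = 0;
  for (int s = 0; s < splits; ++s) {
    for (int c = 0; c < cols; ++c) {
      query_pool.push_back(std::make_unique<ColumnReader>(entries));
    }
    peak = std::max(peak, PoolBytes(query_pool));
  }
  return peak;
}

// Fixed pattern: readers are destroyed at the end of each split, so peak
// memory is bounded by the readers of a single split.
std::size_t PeakBytesSplitPool(int splits, int cols, std::size_t entries) {
  assert(splits > 0 && cols > 0);
  std::size_t peak = 0;
  for (int s = 0; s < splits; ++s) {
    ReaderPool split_pool;  // destroyed when this iteration ends
    for (int c = 0; c < cols; ++c) {
      split_pool.push_back(std::make_unique<ColumnReader>(entries));
    }
    peak = std::max(peak, PoolBytes(split_pool));
  }
  return peak;
}
```

With 4 splits, 2 columns, and 10 dictionary entries per column, the query-lifetime pool peaks at 640 bytes while the per-split pool peaks at 160 bytes. In the sketch, the gap widens linearly with the number of splits.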
Attachments
Issue Links
- relates to IMPALA-2885 Scanners store per-split objects in per-query object pool (Resolved)