Once the reservation management from IMPALA-3200 is enabled, we should start modifying more operators to run within a memory constraint, so that once a query starts, its operators can run to completion.
This means that the bulk of the operator memory, particularly the memory that depends on the input size, should be allocated from the buffer pool. The operator should then be able to execute so long as it gets its minimum reservation.
It is OK for now if expr memory and smaller amounts of memory that are streamed through the plan, such as row batches and small decompression buffers, are not included in this.
We also don't need to handle arbitrarily sized rows, only rows up to a configurable threshold (e.g. set by a query option).
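As a rough illustration of the intended behaviour, the sketch below shows an operator that checks for its minimum reservation up front, takes all input-dependent memory from that reservation, and spills a page instead of exceeding it. All names here (Reservation, BufferingOperator, TryAllocate, AddPage) are hypothetical and are not Impala's actual BufferPool API; spilling is only simulated with counters.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a buffer pool reservation; not Impala's real API.
class Reservation {
 public:
  explicit Reservation(int64_t limit_bytes) : limit_(limit_bytes) {}

  // Take `bytes` from the reservation. Returns false rather than exceeding
  // the limit, so the caller can spill instead of over-allocating.
  bool TryAllocate(int64_t bytes) {
    if (bytes < 0 || used_ + bytes > limit_) return false;
    used_ += bytes;
    return true;
  }

  void Release(int64_t bytes) { used_ -= bytes; }
  int64_t used() const { return used_; }
  int64_t limit() const { return limit_; }

 private:
  int64_t limit_;
  int64_t used_ = 0;
};

// Hypothetical operator that buffers its input in fixed-size pages.
class BufferingOperator {
 public:
  BufferingOperator(Reservation* reservation, int64_t page_bytes, int min_pages)
      : reservation_(reservation), page_bytes_(page_bytes), min_pages_(min_pages) {}

  // The operator may only start if the reservation covers its minimum
  // number of pages. After that it should be able to run to completion.
  bool Open() {
    opened_ = min_pages_ >= 1 && reservation_->limit() >= page_bytes_ * min_pages_;
    return opened_;
  }

  // Buffer one page of input. If the reservation is exhausted, spill one
  // in-memory page (simulated here) and reuse its memory.
  void AddPage() {
    assert(opened_);
    if (!reservation_->TryAllocate(page_bytes_)) {
      // Open() guaranteed room for at least one page, so a page is in memory.
      assert(pages_in_memory_ > 0);
      reservation_->Release(page_bytes_);
      --pages_in_memory_;
      ++pages_spilled_;
      bool ok = reservation_->TryAllocate(page_bytes_);
      assert(ok);
      (void)ok;
    }
    ++pages_in_memory_;
  }

  int pages_in_memory() const { return pages_in_memory_; }
  int pages_spilled() const { return pages_spilled_; }

 private:
  Reservation* reservation_;
  int64_t page_bytes_;
  int min_pages_;
  bool opened_ = false;
  int pages_in_memory_ = 0;
  int pages_spilled_ = 0;
};
```

With a reservation of two 4096-byte pages, feeding five pages through this sketch keeps two pages in memory and marks three as spilled, while the bytes in use never exceed the reservation. An operator whose minimum reservation is not available fails in Open() rather than partway through execution.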
Issues in epic
|IMPALA-3304||Aggregations do not always spill when exprs use large amount of memory||Open||Unassigned|
|IMPALA-3471||TopN should be able to spill||Open||Unassigned|
|IMPALA-4835||HDFS scans should operate with a constrained number of I/O buffers||Resolved||Tim Armstrong|
|IMPALA-4836||Kudu scans should operate within a memory constraint||Open||Unassigned|
|IMPALA-4837||Nested loop join should operate within a memory constraint||Open||Unassigned|
|IMPALA-4857||Handle large # of duplicate keys on build side of a spilling hash join||Open||Kurt Deschler|
|IMPALA-5519||Allocate fragment instance runtime filter memory from BufferPool||Resolved||Bikramjeet Vig|
|IMPALA-5707||HDFS table sinks should operate within a memory constraint||Open||Unassigned|
|IMPALA-5738||Make analytic window spillable||Open||Unassigned|
|IMPALA-5834||Handle spilling large number of NULLs on the build side of a null-aware anti join||Open||Unassigned|
|IMPALA-7220||Queries processing very large strings hit "memory limit exceeded" instead of spilling||Open||Unassigned|