Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 3.4.0
-
None
Description
While working on IMPALA-9856, I found that the following DCHECK in SpillableRowBatchQueue::AddBatch is hit consistently when result spooling is enabled and the row size is larger than resource_profile_.max_reservation, causing impalad to crash.
https://github.com/apache/impala/blob/eea617b/be/src/runtime/spillable-row-batch-queue.cc#L97
We can reproduce this issue by adding the following query options in query_test/test_insert.py::TestInsertQueries::test_insert_large_string:
self.client.set_configuration_option("spool_query_results", "1")
self.client.set_configuration_option("max_row_size", "257mb")
Additionally, setting max_result_spooling_mem to 512MB will increase
resource_profile_.max_reservation to fit the large row and avoid this DCHECK.
Instead of hitting a DCHECK, I think impalad should return an error status suggesting that the user set a larger max_result_spooling_mem.
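The idea of returning an error instead of crashing can be modeled roughly as follows. This is only an illustrative Python sketch, not Impala code: the Status class, the check_row_fits function, and the message wording are hypothetical, and the exact condition guarded by the DCHECK is simplified here to "row size exceeds the maximum reservation" as described above.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class Status:
    """Hypothetical stand-in for an OK/error status object."""
    ok: bool
    message: str = ""


def check_row_fits(row_size_bytes: int, max_reservation_bytes: int) -> Status:
    """Return an error status, rather than asserting, when a row cannot fit."""
    if row_size_bytes > max_reservation_bytes:
        # Tell the user which query option to raise instead of crashing.
        return Status(
            ok=False,
            message=(
                f"Row of {row_size_bytes} bytes exceeds the maximum reservation "
                f"of {max_reservation_bytes} bytes for result spooling. "
                "Consider increasing max_result_spooling_mem."
            ),
        )
    return Status(ok=True)


# Illustrative values only: a 257 MB row against a 100 MB reservation.
status = check_row_fits(257 * MB, 100 * MB)
if not status.ok:
    print(status.message)
```

With this shape, the query would fail with an actionable message while the daemon keeps running.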
Another solution is to also consider max_row_size when computing maxMemReservationBytes in PlanRootSink.java.
https://github.com/apache/impala/blob/eea617b/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java#L74
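The second option could look roughly like the sketch below. It is written in Python for illustration only and does not reproduce the actual computation in PlanRootSink.java; the function name compute_max_reservation and the rule of taking the larger of the two values are assumptions.

```python
MB = 1024 * 1024


def compute_max_reservation(max_result_spooling_mem: int, max_row_size: int) -> int:
    """Hypothetical rule: never let the max reservation drop below max_row_size."""
    return max(max_result_spooling_mem, max_row_size)


# Illustrative values only: with a 257 MB max_row_size, a smaller spooling
# limit is raised so that a single large row can still fit.
print(compute_max_reservation(100 * MB, 257 * MB) // MB)
print(compute_max_reservation(512 * MB, 257 * MB) // MB)
```

The trade-off is that the planner would silently reserve more memory than max_result_spooling_mem asks for whenever max_row_size is larger.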
Issue Links
- is related to IMPALA-9856 Enable result spooling by default (Resolved)