Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 2.8.0
Description
Now that IMPALA-3567 is solved, the next step is to add the plumbing that lets a join builder act as the sink of a plan fragment, which is needed to implement the parallel plans added in http://gerrit.cloudera.org:8080/2846
This JIRA tracks making the plans executable, without sharing the join build for broadcast joins.
Steps required:
- Enable the join build sink in the planner
- Update the planner to include all required state in the thrift objects (the join build sinks are currently missing various required info).
- Update the planner's resource requirement calculations: join build fragments need real resource estimates.
- Update the scheduler to schedule join build fragments co-located with their parent fragments. This depends on the build plans being sent pre-order. Pass the source fragment instance id into the join nodes so they can locate the input fragment instance.
- Update scheduler to correctly handle multiple build plans.
- Instantiate the join builders as input sinks to the plan. This requires getting some data from the thrift structs instead of having it passed in from the PHJNode.
- Ensure the join builders function correctly as plan sinks (e.g. add an indefinite wait to the join node to prevent it from crashing, and ensure that the builder consumes the whole input). Initially we probably want to have the build thread block in Close().
- Update the join node so that in the non-subplan mt_dop > 0 case, it looks up the input fragment instance and waits for it to finish the build (with cancellation). Need to find all the places where it looks for the right child.
- After that the join node "owns" the builder, so the control flow should be mostly the same. The main difference is that the buffer pool client and memory tracking are set up differently. Maybe need to change the Close() call as well?
- Figure out any resource management issues across the build and probe (threads, memory, etc.). Fix up the builder thread behaviour so that Close() doesn't block and the thread is released.
This, I think, needs to be one change because the intermediate states aren't testable or functional.
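To make the wait/handoff steps above concrete, here is a minimal sketch of how the build thread and the join node could coordinate, including cancellation and the initial "build thread blocks in Close()" behaviour. This is not Impala code: `BuildHandoff`, its methods, and `RunOnce` are hypothetical names invented for illustration, and the sketch uses only the C++ standard library rather than Impala's own threading and status utilities.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical state shared between one join build sink and the one join
// node that consumes its build (no sharing of broadcast builds).
class BuildHandoff {
 public:
  enum class WaitResult { BUILD_READY, CANCELLED };

  // Build side: called once the builder has consumed its whole input.
  void PublishBuild() {
    {
      std::lock_guard<std::mutex> l(lock_);
      build_ready_ = true;
    }
    cv_.notify_all();
  }

  // Probe side: the join node blocks here instead of opening a right child.
  // Returns when the build is published or the query is cancelled.
  WaitResult WaitForBuild() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return build_ready_ || cancelled_; });
    return cancelled_ ? WaitResult::CANCELLED : WaitResult::BUILD_READY;
  }

  // Probe side: the join node is done with the build data.
  void ReleaseBuild() {
    {
      std::lock_guard<std::mutex> l(lock_);
      assert(build_ready_);  // Only a published build can be released.
      probe_done_ = true;
    }
    cv_.notify_all();
  }

  // Build side: models the build thread blocking in Close() until the
  // probe side has released the build or the query is cancelled.
  void WaitForProbeDone() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return probe_done_ || cancelled_; });
  }

  // Either side: wakes up every waiter.
  void Cancel() {
    {
      std::lock_guard<std::mutex> l(lock_);
      cancelled_ = true;
    }
    cv_.notify_all();
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  bool build_ready_ = false;
  bool probe_done_ = false;
  bool cancelled_ = false;
};

// Runs one build thread against the calling (probe) thread and returns what
// the probe side observed.
BuildHandoff::WaitResult RunOnce(bool cancel_instead_of_build) {
  BuildHandoff handoff;
  std::thread build([&handoff, cancel_instead_of_build] {
    if (cancel_instead_of_build) {
      handoff.Cancel();
      return;
    }
    handoff.PublishBuild();
    handoff.WaitForProbeDone();  // Stand-in for blocking in Close().
  });
  BuildHandoff::WaitResult result = handoff.WaitForBuild();
  if (result == BuildHandoff::WaitResult::BUILD_READY) handoff.ReleaseBuild();
  build.join();
  return result;
}
```

Calling `RunOnce(false)` exercises the normal path (the probe side sees the build and then releases the build thread), while `RunOnce(true)` exercises cancellation before the build is published. The last step in the list, making Close() non-blocking, would replace `WaitForProbeDone()` with some transfer of ownership to the join node; that part is deliberately not sketched here.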
Testing:
- Existing mt join tests are useful and will exercise the new behaviour.
- Ensure spilling is tested with multithreading (new dimension to spilling tests?)
- Ensure cancellation is tested.
Issue Links
- breaks
  - IMPALA-9737 DCHECK in buffer-pool.cc - min_bytes_to_write <= dirty_unpinned_pages_.bytes() (Resolved)
- depends upon
  - IMPALA-9125 Add general mechanism to find DataSink from other fragments (Resolved)
  - IMPALA-9126 Cleanly separate build and probe state in hash join node (Resolved)
  - IMPALA-9127 Clean up probe-side state machine in hash join (Resolved)
- is depended upon by
  - IMPALA-9156 Share broadcast join builds between fragments (Resolved)