- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Component/s: Backend
- Labels:
We currently use one thread to scan each Kudu tablet, which can under-parallelise queries in many cases. Kudu added an API in KUDU-2437 and KUDU-2670 to split scan tokens at a finer granularity, by primary key range within a tablet.
The major downside is that the planner has to issue an extra RPC to a tserver for each tablet being scanned in order to work out the key range splits. Maybe we can tie this to mt_dop >= 2, or use some heuristics to avoid these RPCs for smaller tables.
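To make the suggested gating concrete, here is a minimal sketch of one possible heuristic. It is not Impala planner code: the function name, the `SplitDecision` type, and both thresholds are hypothetical, and the table size is assumed to come from whatever estimate the planner already has. It only requests finer-grained tokens when mt_dop >= 2 and the table is large enough for the extra per-tablet RPCs to plausibly pay off.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass(frozen=True)
class SplitDecision:
    request_splits: bool
    split_size_bytes: int  # 0 means "keep one token per tablet"


NO_SPLIT = SplitDecision(False, 0)


def choose_token_split(mt_dop, est_table_bytes, num_tablets,
                       min_table_bytes=256 * MIB,   # hypothetical threshold
                       min_split_bytes=32 * MIB):   # hypothetical threshold
    """Decide whether to ask Kudu for key-range splits (illustrative only)."""
    # Without intra-node parallelism the extra tokens cannot be used.
    if mt_dop < 2:
        return NO_SPLIT
    # Small tables: skip the per-tablet RPCs entirely.
    if num_tablets <= 0 or est_table_bytes < min_table_bytes:
        return NO_SPLIT
    bytes_per_tablet = est_table_bytes // num_tablets
    # Aim for roughly mt_dop chunks per tablet, but never tiny chunks.
    target = max(min_split_bytes, bytes_per_tablet // mt_dop)
    # If the target is not smaller than a tablet, splitting yields nothing.
    if target >= bytes_per_tablet:
        return NO_SPLIT
    return SplitDecision(True, target)
```

The returned `split_size_bytes` would then be handed to the split-size setting of the Kudu scan token builder. Whether tablet count and estimated size are the right inputs is an open design question for this ticket.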
- causes
  - IMPALA-10245 Test fails in TestKuduReadTokenSplit.test_kudu_scanner (Resolved)
- is related to
  - KUDU-2670 Splitting more tasks for spark job, and add more concurrent for scan operation (Open)
  - KUDU-2437 Split a tablet into primary key ranges by size (Resolved)
- relates to
  - IMPALA-9656 Dynamic intra-node load balancing for Kudu (and maybe HBase) scans. (Open)