[DRILL-6115] SingleMergeExchange is not scaling up when many minor fragments are allocated for a query. - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.12.0
Fix Version/s: 1.13.0
Component/s: Execution - Relational Operators
Labels:
- ready-to-commit

Description

SingleMergeExchange is created when a global order is required in the output. The following query produces the SingleMergeExchange.

0: jdbc:drill:zk=local> explain plan for select L_LINENUMBER from dfs.`/drill/tables/lineitem` order by L_LINENUMBER;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(L_LINENUMBER=[$0])
00-02        SingleMergeExchange(sort0=[0])
01-01          SelectionVectorRemover
01-02            Sort(sort0=[$0], dir0=[ASC])
01-03              HashToRandomExchange(dist0=[[$0]])
02-01                Scan(table=[[dfs, /drill/tables/lineitem]], groupscan=[JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=maprfs:///drill/tables/lineitem, condition=null], columns=[`L_LINENUMBER`], maxwidth=15]])

On a 10-node cluster, if the table is large, Drill can spawn many minor fragments, all of which are merged on a single node by one merge receiver. This creates heavy memory pressure on the receiver node and makes it an execution bottleneck. To address this, the merge receiver should become a multiphase merge receiver.

Ideally, on a large cluster one could introduce tree merges so that merging is done in parallel. As a first step, however, I think it is better to use the existing infrastructure for multiplexing operators to generate an OrderedMux, so that all the minor fragments belonging to one drillbit are merged locally and only the merged stream is sent to the receiver operator.

For example, consider a 10-node cluster where each node processes 14 minor fragments.

The current version of the code merges all 140 minor fragments at a single receiver.
The proposed version merges in two levels: first, a 14-way merge within each drillbit, with the 10 drillbits merging in parallel;
then a 10-way merge of the resulting minor fragments at the receiver node.
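
The fan-in arithmetic above can be illustrated with a small sketch. It is not Drill code: it uses Python's heapq.merge as a stand-in for a merge receiver, and the names (make_sorted_fragments, single_phase_merge, two_phase_merge) and the synthetic data are assumptions made purely for illustration. It only shows that a per-node merge followed by a merge of the per-node streams yields the same global order while reducing the receiver's fan-in from 140 to 10.

```python
import heapq
import random

NODES = 10               # drillbits in the example cluster
FRAGMENTS_PER_NODE = 14  # minor fragments per drillbit


def make_sorted_fragments(rng, nodes, per_node, rows=5):
    """Build synthetic data: each minor fragment emits a locally sorted stream."""
    return [
        [sorted(rng.randrange(1, 8) for _ in range(rows)) for _ in range(per_node)]
        for _ in range(nodes)
    ]


def single_phase_merge(fragments_by_node):
    """Current behaviour (sketch): one receiver merges every fragment stream."""
    all_streams = [frag for node in fragments_by_node for frag in node]
    return list(heapq.merge(*all_streams)), len(all_streams)


def two_phase_merge(fragments_by_node):
    """Proposed behaviour (sketch): merge per node first, then merge node streams."""
    # Phase 1: each node merges its own fragments (independent, so parallelisable).
    per_node_streams = [list(heapq.merge(*node)) for node in fragments_by_node]
    # Phase 2: the receiver merges one ordered stream per node.
    return list(heapq.merge(*per_node_streams)), len(per_node_streams)


data = make_sorted_fragments(random.Random(0), NODES, FRAGMENTS_PER_NODE)
flat_result, flat_fan_in = single_phase_merge(data)
tiered_result, tiered_fan_in = two_phase_merge(data)
```

Here flat_fan_in and tiered_fan_in are the number of input streams the final receiver has to hold open at once, which is the quantity that drives the memory pressure described above.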

Attachments

Enhancing Drill to multiplex ordered merge exchanges.docx
29/Jan/18 19:47
9 kB
Hanumath Rao Maduri

Issue Links

links to

GitHub Pull Request #1110

People

Assignee: Hanumath Rao Maduri

Reporter: Hanumath Rao Maduri

Reviewer: Vlad Rozov

Votes: 0

Watchers: 4

Dates

Created: 29/Jan/18 19:47

Updated: 24/Feb/18 01:21

Resolved: 24/Feb/18 01:21