  1. Apache Drill
  2. DRILL-5223

Drill should ensure balanced workload assignment at node level in order to get better query performance


Details

    Description

      Drill's work assignment logic currently aims to balance the workload across minor fragments (or slices) and to honor data affinity in order to get as many local reads as possible.

      However, when the number of work units cannot be evenly divided by the number of minor fragments, the remaining work units tend to go to the first subset of drill endpoints. This means the drill endpoints assigned the remaining work units can carry a larger workload than the rest. When MuxExchange is enabled (the default), all the minor fragments on the same node have to send data to a single Muxer per node, so unbalanced workload assignment at the node level can increase query elapsed time, which is essentially determined by the slowest drill endpoint.
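
The imbalance described above can be illustrated with a small sketch. This is not Drill's actual assignment code: the function names (`assign_first_fit`, `assign_node_balanced`, `node_totals`), the node labels, and the assumption that fragments are listed grouped by node are all hypothetical and chosen only to show how leftover work units can pile up on one node, and how spreading them by node load evens things out.

```python
from collections import Counter


def assign_first_fit(num_units, fragment_nodes):
    """Give each fragment the floor share, then hand the leftover
    units to the first fragments in list order (assumed behavior)."""
    base, extra = divmod(num_units, len(fragment_nodes))
    return [base + (1 if i < extra else 0) for i in range(len(fragment_nodes))]


def assign_node_balanced(num_units, fragment_nodes):
    """Give each fragment the floor share, then hand each leftover unit
    to a fragment on the currently least-loaded node (hypothetical fix)."""
    n = len(fragment_nodes)
    base, extra = divmod(num_units, n)
    counts = [base] * n
    node_load = Counter()
    for node in fragment_nodes:
        node_load[node] += base
    for _ in range(extra):
        # Only fragments that have not yet received a leftover unit qualify.
        candidates = [i for i in range(n) if counts[i] == base]
        pick = min(candidates, key=lambda i: (node_load[fragment_nodes[i]], i))
        counts[pick] += 1
        node_load[fragment_nodes[pick]] += 1
    return counts


def node_totals(counts, fragment_nodes):
    """Sum per-fragment work unit counts into per-node totals."""
    totals = Counter()
    for count, node in zip(counts, fragment_nodes):
        totals[node] += count
    return dict(totals)


# Three nodes with two minor fragments each, and 8 work units to assign.
fragment_nodes = ["A", "A", "B", "B", "C", "C"]
print(node_totals(assign_first_fit(8, fragment_nodes), fragment_nodes))
print(node_totals(assign_node_balanced(8, fragment_nodes), fragment_nodes))
```

In this example both approaches give every minor fragment either one or two work units, so the fragment-level balance is the same. The difference is at the node level: the first-fit variant places both leftover units on node A (4 units versus 2 on the others), while the node-aware variant caps every node at 3 units, which matters when a single per-node Muxer receives the output of all local fragments.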

      Prototype experimental runs show that, with more balanced workload assignment, Drill achieves significant improvement on most TPC-H queries.

            People

              ppenumarthy Padma Penumarthy
              jni Jinfeng Ni
              Jinfeng Ni Jinfeng Ni
