IMPALA-3902

Multi-threaded query execution


    Details

    • Type: Epic
    • Status: Reopened
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: None
    • Component/s: Backend
    • Labels:
    • Epic Name:
      Impala Multi-threaded query execution

      Description

      Currently, a single query fragment runs in a quasi-single-threaded manner on each node: the scanners run in multiple threads, but all other operators (joins, aggregations) run in the main thread.

      The goal is to add multi-threaded execution on a single node by running multiple fragment instances, each of which runs in a single thread.
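
      The execution model described above can be illustrated with a minimal sketch. This is not Impala code: the function names (`split_ranges`, `run_fragment_instance`, `run_fragment`), the round-robin assignment, and the toy sum aggregation are all assumptions made for illustration. The `dop` parameter plays the role of a degree of parallelism such as `mt_dop`. Each fragment instance runs its scan and aggregation in one thread, and parallelism comes only from running several instances side by side.

      ```python
      from collections import Counter
      from concurrent.futures import ThreadPoolExecutor


      def split_ranges(scan_ranges, dop):
          """Assign scan ranges to `dop` instances round-robin (hypothetical policy)."""
          return [scan_ranges[i::dop] for i in range(dop)]


      def run_fragment_instance(ranges):
          """One fragment instance: scan and aggregate entirely in the calling thread."""
          partial = Counter()
          for rows in ranges:            # "scan" each assigned range in turn
              for key, value in rows:    # toy aggregation: SUM(value) GROUP BY key
                  partial[key] += value
          return partial


      def run_fragment(scan_ranges, dop):
          """Run `dop` single-threaded instances in parallel and merge their results."""
          assignments = split_ranges(scan_ranges, dop)
          with ThreadPoolExecutor(max_workers=dop) as pool:
              partials = list(pool.map(run_fragment_instance, assignments))
          merged = Counter()
          for partial in partials:
              merged.update(partial)     # Counter.update adds counts per key
          return dict(merged)
      ```

      For example, `run_fragment([[("a", 1), ("b", 2)], [("a", 3)], [("c", 4)]], dop=2)` gives the same totals as `dop=1`; only the number of concurrently running instances changes. The sketch shows the structure only and makes no claim about speedup, since Python threads do not run CPU-bound work in parallel.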


          Issue Links

          1. Switch to per-query exec rpc (Sub-task, Resolved, Marcel Kinard)
          2. Factor out join build-side (Sub-task, Resolved, Tim Armstrong)
          3. Scheduler improvements for running multiple fragment instances on a single backend (Sub-task, Resolved, Marcel Kinard)
          4. Single-threaded scan node (Sub-task, Resolved, Alexander Behm)
          5. Generate codegen module without operator/query-specific constants (Sub-task, Resolved, Michael Ho)
          6. Introduce query-wide execution state. (Sub-task, Resolved, Marcel Kinard)
          7. Experimental flag for running all queries with mt_dop (Sub-task, Resolved, Tim Armstrong)
          8. Planner should disallow queries with mt_dop > 0 that are not executable. (Sub-task, Resolved, Alexander Behm)
          9. Standardize on MT-related data structures in the coordinator and scheduler (Sub-task, Resolved, Marcel Kinard)
          10. COMPUTE STATS on Parquet tables uses MT_DOP=4 by default (Sub-task, Resolved, Alexander Behm)
          11. Adjust maximum size of row batch queue for non-Parquet scans with MT_DOP>0. (Sub-task, Resolved, Alexander Behm)
          12. Single-threaded KuduScanNode (Sub-task, Resolved, Joe McDonnell)
          13. ReleaseResources() should not destroy control structures (Sub-task, Resolved, Unassigned)
          14. COMPUTE STATS uses MT_DOP=4 by default (Sub-task, Resolved, Tim Armstrong)
          15. Create tests for multi-threaded query execution (Sub-task, Resolved, Unassigned)
          16. Allow graceful fallback to mt_dop=0 (Sub-task, Resolved, Tim Armstrong)
          17. Admission control accounting for mt_dop (Sub-task, Resolved, Tim Armstrong)
          18. Use better algorithm for allocating scan ranges to finstances within a daemon in schedule (Sub-task, Resolved, Tim Armstrong)
          19. Parallelise all plans, including UNION, when mt_dop > 1 (Sub-task, Resolved, Tim Armstrong)
          20. MT Scanners do not check runtime filters per-file before processing each split (Sub-task, Resolved, Unassigned)
          21. Fix runtime filter bugs with mt_dop (Sub-task, Resolved, Tim Armstrong)
          22. Fix cancellation of RuntimeFilter::WaitForArrival() (Sub-task, Resolved, Tim Armstrong)
          23. Update planner decisions to factor in mt_dop (Sub-task, Resolved, Tim Armstrong)
          24. testMtDopValidationWithHDFSNumRowsEstDisabled toggles isTestEnv() changing table loading behaviour (Sub-task, Resolved, Tim Armstrong)
          25. Parallelise flush in data stream sender (Sub-task, Resolved, Tim Armstrong)
          26. Add general mechanism to find DataSink from other fragments (Sub-task, Resolved, Tim Armstrong)


              People

              • Assignee: Tim Armstrong (tarmstrong)
              • Reporter: Marcel Kinard (marcelk)
              • Votes: 4
              • Watchers: 30

                Dates

                • Created:
                  Updated: