  1. IMPALA
  2. IMPALA-3902

Multi-threaded query execution

Details

    • Type: Epic
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: Impala 4.0.0
    • Component/s: Backend
    • Epic Name: Impala Multi-threaded query execution

    Description

      Currently, a single query fragment runs in a quasi-single-threaded manner on each node: the scanners run in multiple threads, but all other operators (joins, aggregations) run in the fragment's main thread.

      The goal is to add multi-threaded execution on a single node by running multiple fragment instances (each of which runs in a single thread).
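
      The model described above can be sketched as follows. This is only an illustration of the idea, not Impala's backend code: the function names, the round-robin split of scan ranges, and the toy aggregation are all assumptions made for the example. Each fragment instance runs its whole operator pipeline (scan plus partial aggregation) in one thread, and parallelism comes from running several instances of the same fragment side by side.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_scan_ranges(scan_ranges, dop):
    """Hypothetical helper: round-robin the scan ranges across `dop` instances (dop >= 1)."""
    return [scan_ranges[i::dop] for i in range(dop)]


def run_fragment_instance(ranges):
    """One fragment instance: scan and pre-aggregate entirely in the calling thread."""
    partial = Counter()
    for rows in ranges:            # "scan": iterate the rows of each assigned range
        for key, value in rows:    # "aggregate": sum values per key
            partial[key] += value
    return partial


def run_fragment(scan_ranges, dop):
    """Run `dop` single-threaded instances of the same fragment and merge their output."""
    instances = split_scan_ranges(scan_ranges, dop)
    with ThreadPoolExecutor(max_workers=dop) as pool:
        partials = list(pool.map(run_fragment_instance, instances))
    # Stand-in for whatever downstream step combines the instances' results.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


if __name__ == "__main__":
    ranges = [[("a", 1), ("b", 2)], [("a", 3)], [("c", 4), ("b", 1)], [("a", 1)]]
    print(run_fragment(ranges, dop=3))
```

      The result is the same for any degree of parallelism; only the number of concurrently running instances changes. Note that CPython threads do not give CPU parallelism for this kind of work, so the sketch shows the structure of the approach rather than its performance.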

        Issue Links

          1. Switch to per-query exec rpc (Sub-task, Resolved, Marcel Kinard)
          2. Factor out join build-side (Sub-task, Resolved, Tim Armstrong)
          3. Scheduler improvements for running multiple fragment instances on a single backend (Sub-task, Resolved, Marcel Kinard)
          4. Single-threaded scan node (Sub-task, Resolved, Alexander Behm)
          5. Generate codegen module without operator/query-specific constants (Sub-task, Resolved, Michael Ho)
          6. Introduce query-wide execution state. (Sub-task, Resolved, Marcel Kinard)
          7. Experimental flag for running all queries with mt_dop (Sub-task, Resolved, Tim Armstrong)
          8. Planner should disallow queries with mt_dop > 0 that are not executable. (Sub-task, Resolved, Alexander Behm)
          9. Standardize on MT-related data structures in the coordinator and scheduler (Sub-task, Resolved, Marcel Kinard)
          10. COMPUTE STATS on Parquet tables uses MT_DOP=4 by default (Sub-task, Resolved, Alexander Behm)
          11. Adjust maximum size of row batch queue for non-Parquet scans with MT_DOP>0. (Sub-task, Resolved, Alexander Behm)
          12. Single-threaded KuduScanNode (Sub-task, Resolved, Joe McDonnell)
          13. ReleaseResources() should not destroy control structures (Sub-task, Resolved, Unassigned)
          14. COMPUTE STATS uses MT_DOP=4 by default (Sub-task, Resolved, Tim Armstrong)
          15. Create tests for multi-threaded query execution (Sub-task, Resolved, Unassigned)
          16. Allow graceful fallback to mt_dop=0 (Sub-task, Resolved, Tim Armstrong)
          17. Admission control accounting for mt_dop (Sub-task, Resolved, Tim Armstrong)
          18. Use better algorithm for allocating scan ranges to finstances within a daemon in schedule (Sub-task, Resolved, Tim Armstrong)
          19. Parallelise all plans, including UNION, when mt_dop > 1 (Sub-task, Resolved, Tim Armstrong)
          20. Fix runtime filter bugs with mt_dop (Sub-task, Resolved, Tim Armstrong)
          21. Fix cancellation of RuntimeFilter::WaitForArrival() (Sub-task, Resolved, Tim Armstrong)
          22. Update planner decisions to factor in mt_dop (Sub-task, Resolved, Tim Armstrong)
          23. testMtDopValidationWithHDFSNumRowsEstDisabled toggles isTestEnv() changing table loading behaviour (Sub-task, Resolved, Tim Armstrong)
          24. Parallelise flush in data stream sender (Sub-task, Resolved, Tim Armstrong)
          25. Add general mechanism to find DataSink from other fragments (Sub-task, Resolved, Tim Armstrong)

            People

              Assignee: Tim Armstrong (tarmstrong)
              Reporter: Marcel Kinard (marcelk)
              Votes: 4
              Watchers: 35
