Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: None
    • Component/s: Backend
    • Labels: None

      Description

      Currently, a single query fragment runs in a quasi-single-threaded manner on a node: the scanners run in multiple threads, but all other operators (joins, aggregations) run in the main thread.

      The goal is to add multi-threaded execution on a single node by running multiple fragment instances (each of which runs in a single thread).
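The proposed model can be illustrated with a small conceptual sketch. This is not Impala code: the function names (`run_fragment_instance`, `run_fragment`), the round-robin split assignment, and the COUNT(*)-style aggregation are all assumptions made for illustration. Each hypothetical fragment instance processes its own share of the input entirely on one thread, and the partial results are merged afterwards.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_fragment_instance(rows):
    """One hypothetical fragment instance: scans its rows and aggregates
    them on a single thread (here: a COUNT(*) GROUP BY key)."""
    partial = Counter()
    for key in rows:
        partial[key] += 1
    return partial


def run_fragment(rows, dop):
    """Run `dop` instances of the same fragment on one node, one thread
    each, then merge the partial results in a final aggregation step."""
    # Assumed round-robin assignment of input rows to instances.
    splits = [rows[i::dop] for i in range(dop)]
    with ThreadPoolExecutor(max_workers=dop) as pool:
        partials = list(pool.map(run_fragment_instance, splits))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts per key
    return dict(merged)
```

The sketch only shows the structure (several single-threaded instances plus a merge). In CPython the global interpreter lock prevents real CPU parallelism for this kind of work, so it says nothing about performance.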

          Issue Links

          1. Switch to per-query exec rpc | Sub-task | Resolved | Marcel Kornacker
          2. Factor out join build-side | Sub-task | Resolved | Tim Armstrong
          3. Scheduler improvements for running multiple fragment instances on a single backend | Sub-task | Resolved | Marcel Kornacker
          4. Single-threaded scan node | Sub-task | Resolved | Alexander Behm
          5. Generate codegen module without operator/query-specific constants | Sub-task | Resolved | Michael Ho
          6. Introduce query-wide execution state. | Sub-task | Resolved | Marcel Kornacker
          7. Experimental flag for running all queries with mt_dop | Sub-task | Resolved | Tim Armstrong
          8. Share codegen work between fragment instances | Sub-task | Open | Michael Ho
          9. Add backend support for join build sinks in parallel plans | Sub-task | Open | Unassigned
          10. Create tests for multi-threaded query execution | Sub-task | Open | Unassigned
          11. Planner should disallow queries with mt_dop > 0 that are not executable. | Sub-task | Resolved | Alexander Behm
          12. Standardize on MT-related data structures in the coordinator and scheduler | Sub-task | Resolved | Marcel Kornacker
          13. Aggregate runtime filters locally | Sub-task | Open | Michael Ho
          14. COMPUTE STATS on Parquet tables uses MT_DOP=4 by default | Sub-task | Resolved | Alexander Behm
          15. Adjust maximum size of row batch queue for non-Parquet scans with MT_DOP>0. | Sub-task | Resolved | Alexander Behm
          16. Single-threaded KuduScanNode | Sub-task | Resolved | Joe McDonnell
          17. ReleaseResources() should not destroy control structures | Sub-task | Resolved | Unassigned
          18. COMPUTE STATS uses MT_DOP=4 by default | Sub-task | Open | Unassigned
          19. Simplify ownership of FilterContexts and MemPools in ScannerContext once non-MT scan node is removed | Sub-task | Open | Unassigned
          20. MT Scanners do not check runtime filters per-file before processing each split | Sub-task | Open | Unassigned
          21. Limit number of files generated by unpartitioned insert | Sub-task | Open | Unassigned

              People

              • Assignee:
                Unassigned
                Reporter:
                marcelk Marcel Kornacker
              • Votes:
                4
                Watchers:
                27

                Dates

                • Created:
                  Updated: