IMPALA-4475

Compress ExecPlanFragment before shipping it to worker nodes to reduce network traffic

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: None
    • Component/s: Distributed Exec

      Description

      Sending the ExecPlanFragment to remote nodes dominates query startup time on clusters larger than 100 nodes. The size of the ExecPlanFragment grows with the number of tables and with the number of blocks and partitions in each table.

      On large clusters this limits query throughput.
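The proposal can be illustrated with a minimal sketch: compress the already-serialized fragment on the coordinator and decompress it on the worker. This is not Impala code. The helper names (`compress_fragment`, `decompress_fragment`, `make_fake_fragment`) and the fake payload are assumptions made for illustration; a real fragment would be a serialized plan structure, and the achievable ratio would have to be measured on real plans.

```python
import zlib


def compress_fragment(serialized: bytes, level: int = 1) -> bytes:
    """Compress an already-serialized plan fragment (hypothetical helper).

    A low zlib level favors speed over ratio, since the goal is to cut
    startup latency rather than to minimize bytes at any CPU cost.
    """
    return zlib.compress(serialized, level)


def decompress_fragment(payload: bytes) -> bytes:
    """Inverse of compress_fragment, as it would run on the worker side."""
    return zlib.decompress(payload)


def make_fake_fragment(num_partitions: int) -> bytes:
    """Build a stand-in payload (assumption, not a real plan fragment).

    It mimics the repetitive per-partition metadata, such as file paths,
    that makes fragments grow with the partition count.
    """
    lines = [
        f"hdfs://nn/warehouse/store_returns/part_key={i}/file_{i}.parq\n"
        for i in range(num_partitions)
    ]
    return "".join(lines).encode("utf-8")


if __name__ == "__main__":
    raw = make_fake_fragment(10_000)
    packed = compress_fragment(raw)
    # The round trip must be lossless before the size comparison means anything.
    assert decompress_fragment(packed) == raw
    print(f"raw={len(raw)} bytes, compressed={len(packed)} bytes")
```

Because the same compressed bytes could be sent to every node that runs the fragment, the compression cost would be paid once per fragment rather than once per destination, assuming the payload is identical across destinations.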

      Query timeline from TPC-DS Q11 on a 1K-node cluster:

          Query Timeline: 5m6s
             - Query submitted: 75.256us (75.256us)
             - Planning finished: 1s580ms (1s580ms)
             - Submit for admission: 2s376ms (795.652ms)
             - Completed admission: 2s377ms (1.512ms)
             - Ready to start 15993 fragment instances: 2s458ms (80.378ms)
             - First dynamic filter received: 2m35s (2m33s)
             - All 15993 fragment instances started: 2m35s (40.934ms)
             - Rows available: 4m53s (2m17s)
             - First row fetched: 4m53s (176.254ms)
             - Unregister query: 4m58s (4s828ms)
           - ComputeScanRangeAssignmentTimer: 600.086ms
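To put the timeline in perspective, the following sketch computes how much of the query's wall time falls between "Ready to start 15993 fragment instances" and "All 15993 fragment instances started". The input values are copied from the timeline above; the parser and function names are hypothetical. Timestamps in the profile are rounded (for example `2m35s`), so the results are approximate.

```python
import re

# Seconds per unit for the duration tokens that appear in the timeline.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# Two-letter units are listed first so "458ms" is not read as "458m".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")


def parse_duration(text: str) -> float:
    """Convert a string such as '2s458ms' or '2m35s' to seconds."""
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in _TOKEN.findall(text))


def startup_summary() -> tuple[float, float, float]:
    """Return (startup seconds, share of total time, average ms per instance)."""
    ready = parse_duration("2s458ms")    # Ready to start 15993 fragment instances
    all_started = parse_duration("2m35s")  # All 15993 fragment instances started
    total = parse_duration("5m6s")       # Query Timeline
    instances = 15993

    startup = all_started - ready
    share = startup / total
    per_instance_ms = startup / instances * 1000.0
    return startup, share, per_instance_ms


if __name__ == "__main__":
    startup, share, per_instance_ms = startup_summary()
    print(f"fragment startup: {startup:.1f}s ({share:.0%} of total)")
    print(f"average per fragment instance: {per_instance_ms:.1f}ms")
```

With these inputs, fragment startup accounts for about half of the 5m6s timeline. The per-instance figure is only an average over all instances and does not imply that instances are started one at a time.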
      

        Attachments

        1. count_store_returns.txt.zip
          753 kB
          Mostafa Mokhtar
        2. slow_query_start_250K_partitions_134nodes.txt
          714 kB
          Mostafa Mokhtar

              People

              • Assignee: Unassigned
              • Reporter: Mostafa Mokhtar (mmokhtar)
              • Votes: 0
              • Watchers: 8
