  1. Hive
  2. HIVE-23744

Reduce query startup latency

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: llap
    • Labels:
      None

      Description

      When I run queries with a large number of tasks for a single vertex, I see a significant delay before all tasks start executing in the LLAP daemons.

      Although the LLAP daemons have free capacity to run the tasks, it takes a significant amount of time for the AM to schedule all the tasks and actually transmit them to the executors.

      "am_schedule_and_transmit.png" shows the scheduling of tasks for TPC-DS query 55. It shows only the tasks scheduled for one of the 10 LLAP daemons. The scheduler works in a single thread, scheduling tasks one by one, so a delay in scheduling one task delays all the tasks behind it.
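
      The effect of a single scheduling thread can be illustrated with a toy model. This is only a sketch, not Hive code: the function name `dispatch_times` and the per-task costs are hypothetical values chosen for illustration, not measurements from the attachment.

```python
from itertools import accumulate


def dispatch_times(schedule_costs_ms):
    """Time (ms) at which each task leaves a single-threaded scheduler.

    Tasks are handled strictly in order, so task i is dispatched only after
    the scheduling cost of every task before it has been paid.
    """
    return list(accumulate(schedule_costs_ms))


# Hypothetical costs: 1 ms per task, except one 50 ms stall on the third task.
uniform = dispatch_times([1] * 6)
stalled = dispatch_times([1, 1, 50, 1, 1, 1])

print(uniform)  # [1, 2, 3, 4, 5, 6]
print(stalled)  # [1, 2, 52, 53, 54, 55]
```

      In this model the stall on one task shifts every later task by the same amount, which is the head-of-line blocking described above.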

       

      Another issue is that it takes a long time to fill all the execution slots in the LLAP daemons even though they are all empty initially. This is caused by LlapTaskCommunicator using a fixed number of threads (10 by default) to send the tasks to the daemons. This communication is also synchronized, so these threads block while communicating and sit idle. "task_start.png" shows running tasks on an LLAP daemon that has 12 execution slots. By the time the 12th task starts running, more than 100 ms have already passed. That slot stays idle for all of that time.
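
      A rough way to see why synchronized sends leave slots idle is the toy model below. It is a sketch under stated assumptions, not the LlapTaskCommunicator implementation: it assumes "synchronized" means only one send can be in flight at a time, and the 10 ms per-send latency and the name `arrival_times` are hypothetical.

```python
import heapq


def arrival_times(num_tasks, num_threads, send_ms, synchronized):
    """Arrival time (ms) of each task at a daemon under a toy sender-pool model.

    synchronized=True  -> assumed to mean one send in flight at a time,
                          so extra sender threads add no throughput.
    synchronized=False -> each sender thread transmits independently.
    """
    lanes = 1 if synchronized else num_threads
    free_at = [0] * lanes  # min-heap: time at which each lane becomes free
    arrivals = []
    for _ in range(num_tasks):
        start = heapq.heappop(free_at)  # earliest available lane
        done = start + send_ms
        arrivals.append(done)
        heapq.heappush(free_at, done)
    return arrivals


# Hypothetical numbers: 12 tasks (one per slot), 10 sender threads, 10 ms per send.
serialized = arrival_times(12, 10, 10, synchronized=True)
concurrent = arrival_times(12, 10, 10, synchronized=False)

print(serialized[-1])  # 120
print(concurrent[-1])  # 20
```

      Under these assumptions the 12th slot is filled only after twelve back-to-back sends when communication is serialized, while independent senders would fill it after two rounds.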

        Attachments

        1. task_start.png
          212 kB
          Mustafa İman
        2. am_schedule_and_transmit.png
          162 kB
          Mustafa İman

            People

            • Assignee:
              mustafaiman Mustafa İman
              Reporter:
              mustafaiman Mustafa İman
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated:
                Not Specified
                Remaining:
                0h
                Logged:
                1h 50m