Hive / HIVE-24913 LlapTaskScheduler Improvements / HIVE-24061

Improve llap task scheduling for better cache hit rate

Details

    Description

      TaskInfo is initialized with "requestTime" and the locality delay. When many vertices are at the same level, the "taskInfo" details are available upfront. By the time a task actually gets to scheduling, "requestTime + localityDelay" is no longer higher than the current time. As a result, the scheduler skips the locality delay and ends up choosing a random node, which misses cache hits and reads data from remote storage.

      E.g., this pattern was observed in Q75 of TPC-DS.

      Related lines of interest in scheduler: https://github.com/apache/hive/blob/master/llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java

          boolean shouldDelayForLocality = request.shouldDelayForLocality(schedulerAttemptTime);
          ..
          ..
          boolean shouldDelayForLocality(long schedulerAttemptTime) {
            return localityDelayTimeout > schedulerAttemptTime;
          }
      

       

      Ideally, "localityDelayTimeout" should be adjusted based on the task's first scheduling opportunity.
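
To illustrate the difference, here is a minimal sketch contrasting a deadline fixed at request time with one anchored at the first scheduling attempt. It is not the actual LlapTaskSchedulerService code: the class TaskRequestSketch, its fields, and the method shouldDelayForLocalityFromRequest are hypothetical names chosen for this example, and the real fix may look different.

```java
// Hypothetical sketch; names and structure are assumptions, not Hive's code.
final class TaskRequestSketch {
    private final long requestTime;      // when the task was handed to the scheduler
    private final long localityDelayMs;  // how long to wait for a preferred (cached) node
    // -1 means the scheduler has not yet considered this task.
    private long firstAttemptTime = -1L;

    TaskRequestSketch(long requestTime, long localityDelayMs) {
        this.requestTime = requestTime;
        this.localityDelayMs = localityDelayMs;
    }

    // Behaviour described in the issue: the deadline is requestTime + delay,
    // so a task that waited in the queue may have no delay budget left.
    boolean shouldDelayForLocalityFromRequest(long schedulerAttemptTime) {
        return requestTime + localityDelayMs > schedulerAttemptTime;
    }

    // Suggested direction: start the delay window at the first scheduling
    // attempt, so every task gets the full locality delay.
    boolean shouldDelayForLocality(long schedulerAttemptTime) {
        if (firstAttemptTime < 0) {
            firstAttemptTime = schedulerAttemptTime;
        }
        return firstAttemptTime + localityDelayMs > schedulerAttemptTime;
    }
}
```

With a 100 ms delay and a task requested at time 0 but first considered at time 500, the request-anchored check already returns false, while the attempt-anchored check keeps returning true until time 600.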

            People

              rajesh.balamohan Rajesh Balamohan
                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m