Apache Tez / TEZ-1522

Scheduling can result in out of order execution and slowdown of upstream work

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Later

    Description

           M2                M7
             \              /
         (sg) \            /
               R3         / (b)
                 \       /
              (b) \     /
                   \   /
                    M5

                    R6

      Please refer to the attached task runtime diagram (task_runtime.svg). In this case, M5 was scheduled much earlier than R3 (green in the diagram) and held on to a large number of containers.
      As a result, R3 had fewer containers to work with.

      Below is the output from the status monitor while the job ran. Map_5 took up almost all of the cluster's resources, whereas Reducer_3 got only a fraction of the capacity.

      Map_2: 1/1 Map_5: 0(+373)/1000 Map_7: 1/1 Reducer_3: 0/8000 Reducer_6: 0/1
      Map_2: 1/1 Map_5: 0(+374)/1000 Map_7: 1/1 Reducer_3: 0/8000 Reducer_6: 0/1
      Map_2: 1/1 Map_5: 0(+374)/1000 Map_7: 1/1 Reducer_3: 0(+1)/8000 Reducer_6: 0/1
      ....
      Map_2: 1/1 Map_5: 0(+374)/1000 Map_7: 1/1 Reducer_3: 14(+7)/8000 Reducer_6: 0/1
      Map_2: 1/1 Map_5: 0(+374)/1000 Map_7: 1/1 Reducer_3: 63(+14)/8000 Reducer_6: 0/1
      Map_2: 1/1 Map_5: 0(+374)/1000 Map_7: 1/1 Reducer_3: 159(+22)/8000 Reducer_6: 0/1
      Map_2: 1/1 Map_5: 0(+374)/1000 Map_7: 1/1 Reducer_3: 308(+29)/8000 Reducer_6: 0/1
      ...
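
The imbalance in the output above can be quantified with a short script. This is only a sketch: it assumes each entry reads `<vertex>: <completed>(+<running>)/<total>`, with the `(+running)` part omitted when nothing is running. That reading is an interpretation of the monitor output, not something stated in this issue, and the helper names `parse_status` and `running_share` are made up for illustration.

```python
import re

# Assumed entry format: "<Vertex>: <completed>(+<running>)/<total>".
# The "(+running)" part is optional and treated as 0 when absent.
VERTEX_RE = re.compile(r"(\w+): (\d+)(?:\(\+(\d+)\))?/(\d+)")


def parse_status(line):
    """Return {vertex: (completed, running, total)} for one monitor line."""
    return {
        name: (int(done), int(running or 0), int(total))
        for name, done, running, total in VERTEX_RE.findall(line)
    }


def running_share(status):
    """Return each vertex's fraction of all currently running tasks."""
    total_running = sum(running for _, running, _ in status.values())
    if total_running == 0:
        return {name: 0.0 for name in status}
    return {
        name: running / total_running
        for name, (_, running, _) in status.items()
    }


# One line copied from the status monitor output in this issue.
line = ("Map_2: 1/1 Map_5: 0(+374)/1000 Map_7: 1/1 "
        "Reducer_3: 14(+7)/8000 Reducer_6: 0/1")
status = parse_status(line)
share = running_share(status)
for name in sorted(status):
    print(name, status[name], round(share[name], 3))
```

Under that reading, Map_5 accounts for 374 of the 381 running tasks on this line, while Reducer_3 accounts for 7.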

      Creating this JIRA as a placeholder for a scheduler enhancement. One possibility would be to
      schedule fewer tasks in downstream vertices, based on the information available for the upstream vertex.
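
One way to picture the suggestion is a cap on downstream tasks that grows with upstream progress. The sketch below is purely illustrative: the function `downstream_task_cap`, its parameters, and the linear scaling rule are all assumptions, not part of Tez and not a fix adopted for this issue.

```python
import math


def downstream_task_cap(upstream_completed, upstream_total,
                        downstream_total, min_tasks=1):
    """Hypothetical cap on how many downstream tasks may be scheduled.

    The cap scales linearly with the fraction of upstream tasks that have
    completed, so the upstream vertex keeps most containers while it is
    still far from done. min_tasks keeps the downstream vertex from being
    starved completely.
    """
    if upstream_total <= 0:
        return downstream_total  # no upstream work to wait for
    fraction = min(1.0, upstream_completed / upstream_total)
    cap = math.ceil(fraction * downstream_total)
    return max(min_tasks, min(downstream_total, cap))


# Task counts taken from the status output: Reducer_3 (8000 tasks) is
# upstream of Map_5 (1000 tasks).
print(downstream_task_cap(0, 8000, 1000))     # 1
print(downstream_task_cap(308, 8000, 1000))   # 39
print(downstream_task_cap(8000, 8000, 1000))  # 1000
```

With such a rule, Map_5 would have been limited to a handful of tasks while Reducer_3 had completed only 308 of 8000, instead of holding 374 containers. A real scheduler would also have to account for edge types, task priorities, and container reuse, which this sketch ignores.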

      Attachments

        1. TEZ-1522.am.log.gz
          4.63 MB
          Rajesh Balamohan
        2. TEZ-1522.2.wip.txt
          20 kB
          Siddharth Seth
        3. TEZ-1522.1.wip.txt
          20 kB
          Siddharth Seth
        4. task_runtime.svg
          3.72 MB
          Rajesh Balamohan

          People

            Assignee: Rajesh Balamohan
            Reporter: Rajesh Balamohan
            Votes: 0
            Watchers: 6