Hadoop YARN
YARN-110

AM releases too many containers due to the protocol

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: resourcemanager, scheduler
    • Labels: None

      Description

      • The AM sends a request asking for 4 containers on host H1.
      • Asynchronously, host H1 heartbeats to the RM and gets assigned the 4 containers. The RM, at this point, sets the value
        against H1 to zero in its aggregate request-table for all apps.
      • Meanwhile the AM comes to need 3 more containers, for a total of 7 including the 4 from the previous request.
      • Today, the AM sends the absolute number of 7 against H1 to the RM as part of its request table.
      • The RM seems to override its earlier value of zero against H1 with 7, and thus allocates 7 more containers.
      • The AM already got 4 in this scheduling iteration but receives 7 more, a total of 11 instead of the required 7
        (see the sketch after this list).
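      As a toy sketch (plain Java, not YARN code; the table and variable names are made up for illustration), the race above
      plays out roughly like this:

      import java.util.HashMap;
      import java.util.Map;

      // Toy illustration of the absolute-count race described above; not YARN code.
      public class AbsoluteRequestRace {
          public static void main(String[] args) {
              Map<String, Integer> rmRequestTable = new HashMap<>(); // RM's per-host view for this app
              int containersHandedToAm = 0;

              // 1. AM asks for 4 containers on H1 (an absolute count).
              rmRequestTable.put("H1", 4);

              // 2. H1 heartbeats; the RM assigns 4 containers and zeroes its entry.
              containersHandedToAm += rmRequestTable.get("H1");
              rmRequestTable.put("H1", 0);

              // 3. Meanwhile the AM decides it needs 3 more, so its next allocate()
              //    carries the absolute total of 7 against H1, overwriting the zero.
              rmRequestTable.put("H1", 7);

              // 4. The RM eventually satisfies the "7" as well.
              containersHandedToAm += rmRequestTable.get("H1");
              rmRequestTable.put("H1", 0);

              System.out.println("Containers handed to AM: " + containersHandedToAm); // 11, not the required 7
          }
      }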
        Attachments

      1. YARN-110.patch (6 kB, Arun C Murthy)

        Issue Links

          Activity

          Arun C Murthy added a comment -

          I thought more about this... I spent time on this at the beginning, and I don't think a delta protocol is appropriate, particularly since the RM and AM are never 'in sync'.

          Yes, this could lead to some waste, but the system is eventually consistent.

          Sharad Agarwal added a comment -

          >> Yes, this could lead to some waste, but the system is eventually consistent.

          There will be a lot of waste, particularly for applications that ramp up their requests over time rather than putting all the requests upfront.
          Another issue will be sub-optimal scheduling in the AM.
          In the above example:

          • The AM asks for 3 additional containers (total 7) on a different host, H2.
          • The request table in the AM will get overwritten with 4 on H1 and 3 on H2.
          • The RM may allocate containers on H1 or H2, but in reality it should only try to assign on H2.
          • If the RM gives containers to the AM on H1 first, the AM will do off-host assignments and will release the ones on H2.
          Sharad Agarwal added a comment -

          For clarity, here is the full example:

          • The AM asks for 4 containers on H1.
          • Asynchronously, host H1 heartbeats to the RM and gets assigned the 4 containers. The RM, at this point, sets the value
            against H1 to zero in its aggregate request-table for all apps.
          • Meanwhile the AM comes to need 3 more containers on H2, so a total of 7 (4 on H1, 3 on H2).
          • The request table in the RM will get overwritten with (4 on H1, 3 on H2), total 7.
          • The RM may do locality-aware allocations on H1 or H2, but in reality it should only try to assign on H2.
          • If the RM gives containers to the AM on H1 first, the AM will do off-host assignments and will release the ones on H2
            (a toy sketch of this follows the list).
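          A toy sketch (plain Java, not YARN code; names are illustrative assumptions) of the stale H1 entry steering the RM
          toward containers the AM no longer needs there:

          import java.util.LinkedHashMap;
          import java.util.Map;

          // Toy illustration of the stale-locality problem described above; not YARN code.
          public class StaleLocalityExample {
              public static void main(String[] args) {
                  // What the AM still actually needs at this point: only the 3 on H2.
                  Map<String, Integer> stillNeeded = new LinkedHashMap<>();
                  stillNeeded.put("H2", 3);

                  // What the RM sees after the AM's absolute table overwrites the zero
                  // it had recorded for H1: (4 on H1, 3 on H2).
                  Map<String, Integer> rmRequestTable = new LinkedHashMap<>();
                  rmRequestTable.put("H1", 4);
                  rmRequestTable.put("H2", 3);

                  // A locality-aware RM walking hosts in order may satisfy H1 first,
                  // even though every container granted on H1 is no longer needed there.
                  for (Map.Entry<String, Integer> e : rmRequestTable.entrySet()) {
                      String host = e.getKey();
                      int granted = e.getValue();
                      int needed = stillNeeded.getOrDefault(host, 0);
                      System.out.printf("%s: granted=%d, actually needed=%d, off-host or released=%d%n",
                              host, granted, needed, Math.max(granted - needed, 0));
                  }
              }
          }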
          Vinod Kumar Vavilapalli added a comment -

          Reopening the issue as the discussion is still happening.

          Amol Kekre added a comment -

          Any updates on this?

          Arun C Murthy added a comment -

          Simple patch to resolve the difference between the AM's view of the world and the RM's within the 'transaction', i.e. the allocate call.

          The essential idea is for the RM to account for the containers newly allocated since the last AM heartbeat while updating the #containers for * (ANY).
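          A hypothetical sketch of that idea (the class, field, and method names below are mine, not the attached patch):

          // Hypothetical sketch of the adjustment described above; not the attached patch.
          public class AnyRequestAdjustment {

              // Containers granted to this app since its last allocate() heartbeat.
              private int allocatedSinceLastHeartbeat = 0;

              // The RM's current outstanding count for the * (ANY) request.
              private int anyOutstanding = 0;

              // Called whenever the scheduler hands this app a container.
              void containerAllocated() {
                  allocatedSinceLastHeartbeat++;
              }

              // Called when the AM's allocate() arrives with its absolute ANY count:
              // discount what was already granted inside this "transaction".
              void updateAnyRequest(int absoluteCountFromAm) {
                  anyOutstanding = Math.max(0, absoluteCountFromAm - allocatedSinceLastHeartbeat);
                  allocatedSinceLastHeartbeat = 0;
              }

              public static void main(String[] args) {
                  AnyRequestAdjustment app = new AnyRequestAdjustment();
                  for (int i = 0; i < 4; i++) {
                      app.containerAllocated();      // 4 containers granted between heartbeats
                  }
                  app.updateAnyRequest(7);           // AM then asks for an absolute total of 7
                  System.out.println("Outstanding ANY after adjustment: " + app.anyOutstanding); // 3, not 7
              }
          }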

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12546372/YARN-110.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/48//console

          This message is automatically generated.


            People

            • Assignee: Arun C Murthy
            • Reporter: Arun C Murthy
            • Votes: 0
            • Watchers: 17
