Hadoop YARN / YARN-6091

AppMaster registration fails when using Docker on LinuxContainer

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Implemented
    • Affects Version/s: 2.8.1
    • Fix Version/s: None
    • Component/s: nodemanager, yarn
    • Labels:
    • Environment:

      CentOS

      Description

      On some servers, when I use Docker on LinuxContainer, the AppMaster fails to register with the ResourceManager. This does not happen on other servers.
      I found that pclose (in container-executor.c) returns different values on different servers, even though the process launched by popen is running normally. Some servers return 0, and others return 13.
      YARN treats the application as failed when pclose returns a nonzero value and removes the AMRMToken. The AppMaster's registration then fails because the ResourceManager has already removed this application's token.
      In container-executor.c, the check is whether the return code is zero. However, the pclose man page says that a return value of -1 indicates an error. I changed the check accordingly, which solved the problem.

        Attachments

        1. YARN-6091.002.patch
          1 kB
          Eric Badger
        2. YARN-6091.001.patch
          1 kB
          Eric Badger

              People

              • Assignee:
                ebadger Eric Badger
                Reporter:
                zhengchenyu zhengchenyu
              • Votes:
                0
                Watchers:
                9

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  336h
                  Remaining:
                  336h
                  Logged:
                  Not Specified