  1. Hadoop YARN
  2. YARN-6091

AppMaster registration fails when using Docker on LinuxContainer

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Implemented
    • Affects Version/s: 2.8.1
    • Fix Version/s: None
    • Component/s: nodemanager, yarn
    • Environment: CentOS

    Description

      On some servers, when I use Docker on LinuxContainer, the AppMaster fails to register with the ResourceManager. This does not happen on other servers.
      I found that pclose (in container-executor.c) returns different values on different servers, even though the process launched by popen is running normally. Some servers return 0, and others return 13.
      YARN treats the application as failed when pclose returns nonzero and removes its AMRMToken. The AppMaster registration then fails because the ResourceManager has already removed this application's token.
      In container-executor.c, the check is whether the return code is zero. But according to the pclose man page, a return value of -1 is what indicates an error. So I changed the check, which solved the problem.

      Attachments

        1. YARN-6091.001.patch
          1 kB
          Eric Badger
        2. YARN-6091.002.patch
          1 kB
          Eric Badger

            People

              Assignee: Eric Badger (ebadger)
              Reporter: zhengchenyu
              Votes: 0
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 336h
                  Remaining: 336h
                  Logged: Not Specified