Hadoop YARN / YARN-426

Failure to download a public resource on a node prevents further downloads of the resource from that node

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha, 0.23.6
    • Fix Version/s: 0.23.7, 2.1.0-beta
    • Component/s: nodemanager
    • Labels:
      None

      Description

      If the NodeManager (NM) encounters an error while downloading a public resource, it fails to empty the list of request events for that resource in attempts. If the same public resource is subsequently requested on that node, PublicLocalizer.addResource skips the download because it mistakenly believes a download of that resource is already in progress. From then on, any container that requests the public resource hangs in the LOCALIZING state.
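
The failure mode described above can be illustrated with a minimal sketch. This is a simplified, hypothetical model based only on the description, not the actual ResourceLocalizationService code: the class name PublicLocalizerSketch, the dequeueOnFailure flag, the onDownloadFailed method, and the use of plain strings for resources and containers are all assumptions made for illustration. With the flag off, a failed download leaves a stale entry in attempts, so a later request for the same resource never starts a new download.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the pending-request bookkeeping. */
public class PublicLocalizerSketch {
    // Resource -> containers waiting on it; stands in for the "attempts" list.
    private final Map<String, List<String>> attempts = new HashMap<>();
    // Assumption: false models the reported bug, true models the patched behavior.
    private final boolean dequeueOnFailure;
    // Containers that were told their localization failed.
    final List<String> failureNotices = new ArrayList<>();

    PublicLocalizerSketch(boolean dequeueOnFailure) {
        this.dequeueOnFailure = dequeueOnFailure;
    }

    /** Queues a request; returns true only if a new download is started. */
    boolean addResource(String resource, String containerId) {
        List<String> waiting = attempts.get(resource);
        // An existing entry is taken to mean a download is already in progress.
        boolean startDownload = (waiting == null);
        if (startDownload) {
            waiting = new ArrayList<>();
            attempts.put(resource, waiting);
        }
        waiting.add(containerId);
        return startDownload;
    }

    /** Called when the download of a resource fails. */
    void onDownloadFailed(String resource) {
        if (!dequeueOnFailure) {
            return; // Bug model: the stale entry stays in attempts.
        }
        // Fix model: dequeue every waiting attempt and notify each one.
        List<String> waiting = attempts.remove(resource);
        if (waiting != null) {
            failureNotices.addAll(waiting);
        }
    }

    public static void main(String[] args) {
        String resource = "hdfs://example/public.jar"; // placeholder URI
        for (boolean fixed : new boolean[] {false, true}) {
            PublicLocalizerSketch nm = new PublicLocalizerSketch(fixed);
            nm.addResource(resource, "container_1");
            nm.onDownloadFailed(resource);
            boolean retried = nm.addResource(resource, "container_2");
            System.out.println("dequeueOnFailure=" + fixed
                + " -> retry starts download: " + retried);
        }
    }
}
```

In this sketch the second request starts a download only when the failure handler dequeues the waiting attempts. In the buggy mode the second container is queued behind a download that will never complete, which corresponds to the hang in LOCALIZING.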

      1. YARN-426.patch
        10 kB
        Jason Lowe

        Activity

        Transition                   Time In Source Status   Execution Times   Last Executer         Last Execution Date
        Open -> Patch Available      1d 4h 7m                1                 Jason Lowe            27/Feb/13 00:15
        Patch Available -> Resolved  15h 20m                 1                 Robert Joseph Evans   27/Feb/13 15:35
        Resolved -> Closed           181d 6h 39m             1                 Arun C Murthy         27/Aug/13 23:15
        Allen Wittenauer made changes -
        Fix Version/s 3.0.0 [ 12323268 ]
        Arun C Murthy made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1358 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1358/)
        YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807)

        Result = FAILURE
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1330 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1330/)
        YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #539 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/539/)
        svn merge -c 1450807 FIXES: YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450813)

        Result = UNSTABLE
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450813
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
        • /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
        Hudson added a comment -

        Integrated in Hadoop-Yarn-trunk #141 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/141/)
        YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
        Hudson added a comment -

        Integrated in Hadoop-trunk-Commit #3390 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3390/)
        YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807)

        Result = SUCCESS
        bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807
        Files :

        • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
        • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
        Robert Joseph Evans made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Fix Version/s 0.23.7 [ 12323953 ]
        Fix Version/s 2.0.4-beta [ 12324029 ]
        Fix Version/s 3.0.0 [ 12323268 ]
        Resolution Fixed [ 1 ]
        Robert Joseph Evans added a comment -

        Thanks Jason,

        I put this into trunk, branch-2, and branch-0.23

        Robert Joseph Evans added a comment -

        The patch looks good to me. +1 I'll check it in.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12571093/YARN-426.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 tests included appear to have a timeout.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-YARN-Build/436//testReport/
        Console output: https://builds.apache.org/job/PreCommit-YARN-Build/436//console

        This message is automatically generated.

        Jason Lowe made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Target Version/s 0.23.7, 2.0.4-beta [ 12323953, 12324029 ]
        Jason Lowe made changes -
        Field Original Value New Value
        Attachment YARN-426.patch [ 12571093 ]
        Jason Lowe added a comment -

        Patch to ensure all queued attempts for a public resource are notified of a failed localization and the attempts are dequeued.

        Jason Lowe created issue -

          People

          • Assignee: Jason Lowe
          • Reporter: Jason Lowe
          • Votes: 0
          • Watchers: 6
