Hadoop YARN / YARN-4530

LocalizedResource download failure triggers an NPE that causes the NodeManager to exit

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0, 2.7.1
    • Fix Version/s: 2.9.0, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In our cluster, I found that a failed LocalizedResource download triggered an NPE, which caused the NodeManager to shut down.

      2015-12-29 17:18:33,706 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml transitioned from DOWNLOADING to FAILED
      2015-12-29 17:18:33,708 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar, 1451380519635, FILE, null }
      2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar, 1451380519452, FILE, null },pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING}
      java.io.IOException: Resource hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar changed on src filesystem (expected 1451380519452, was 1451380611793
      	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
      	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276)
      	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar transitioned from DOWNLOADING to FAILED
      2015-12-29 17:18:33,710 FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Error: Shutting down
      java.lang.NullPointerException
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712)
      2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
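
      The log shows a failed public download followed immediately by a NullPointerException in PublicLocalizer.run. The report does not say which reference was null at line 712, so the following is only a rough sketch of one way such a crash can arise and be avoided, not the attached patch. It assumes a worker loop that takes completed futures from a completion queue and looks each one up in a bookkeeping map. The class name PublicLocalizerSketch, the methods drainOne and demo, and the resource name example.jar are all hypothetical.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; this is not the NodeManager's actual code.
public class PublicLocalizerSketch {

    // Takes one completed download and reports its outcome as a string.
    // The map lookup is null-checked before the request is used, so the
    // failure path cannot dereference a missing bookkeeping entry.
    static String drainOne(CompletionService<String> queue,
                           Map<Future<String>, String> pending)
            throws InterruptedException {
        Future<String> completed = queue.take();
        // Null if the entry was never recorded or was already removed.
        String request = pending.remove(completed);
        if (request == null) {
            return "unknown resource";
        }
        try {
            return "localized " + request + " -> " + completed.get();
        } catch (ExecutionException e) {
            // A failed download is reported; the worker keeps running.
            return "failed " + request + ": " + e.getCause().getMessage();
        }
    }

    // Submits one download that always fails and drains its result.
    // When tracked is false, the bookkeeping entry is deliberately omitted.
    static String demo(boolean tracked) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            CompletionService<String> queue = new ExecutorCompletionService<>(pool);
            Map<Future<String>, String> pending = new ConcurrentHashMap<>();
            Callable<String> failing = () -> {
                throw new IOException("changed on src filesystem");
            };
            Future<String> future = queue.submit(failing);
            if (tracked) {
                pending.put(future, "example.jar");
            }
            return drainOne(queue, pending);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(true));
        System.out.println(demo(false));
    }
}
```

      In this sketch, a download that fails with or without a bookkeeping entry produces a reported result rather than an uncaught exception, so a single bad resource does not take down the worker thread.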
      

      Attachments

        1. YARN-4530.1.patch
          2 kB
          Wally Tang

          People

            Assignee: tangshangwen Wally Tang
            Reporter: tangshangwen Wally Tang
            Votes: 0
            Watchers: 8
