Hadoop YARN / YARN-9968

Public Localizer is exiting in NodeManager due to NullPointerException

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.3.0, 3.2.2, 3.1.4
    • Component/s: nodemanager
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The Public Localizer is encountering a NullPointerException and exiting.

      ERROR localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(995)) - Error: Shutting down
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:981)
      
      INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(997)) - Public cache exiting
      

      The NodeManager itself keeps running. However, every subsequent localization event for containers hits the error below, so localization fails for all new containers.

      ERROR localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(920)) - Failed to submit rsrc { { hdfs://namespace/raw/user/.staging/job/conf.xml 1572071824603, FILE, null },pending,[(container_e30_1571858463080_48304_01_000134)],12513553420029113,FAILED} for download. Either queue is full or threadpool is shutdown.
      java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ExecutorCompletionService$QueueingFuture@55c7fa21 rejected from org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@46067edd[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 382286]
              at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
              at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
              at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
              at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:181)
              at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:899)
      

      Once this happens, the NodeManager stays unusable until it is restarted.
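
      The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the NodeManager code: it assumes the localizer loop shuts down its thread pool when it exits, which the "Terminated" pool state in the log suggests. The class name, method name, and the map standing in for pending requests are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the public localizer; not Hadoop code.
public class LocalizerShutdownSketch {

    /** Returns true if a submit made after the consumer loop dies is rejected. */
    static boolean submitRejectedAfterLoopDies() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ExecutorCompletionService<String> queue =
            new ExecutorCompletionService<>(pool);
        // Assumption: a lookup that unexpectedly returns null triggers the NPE.
        Map<String, String> pending = new HashMap<>();

        Thread consumer = new Thread(() -> {
            try {
                // No entry exists for this key, so get() returns null
                // and length() throws NullPointerException.
                pending.get("missing-request").length();
            } catch (Exception e) {
                System.out.println("Error: Shutting down: " + e);
            } finally {
                // The loop exits and takes the pool down with it.
                System.out.println("Public cache exiting");
                pool.shutdownNow();
            }
        });

        try {
            consumer.start();
            consumer.join();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // The owning process is still alive, but the pool is terminated,
        // so every later submission is rejected.
        try {
            queue.submit(() -> "conf.xml");
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + submitRejectedAfterLoopDies());
    }
}
```

      In this sketch the process outlives the consumer thread, which mirrors why the NodeManager stays up while every new resource submission fails until a restart recreates the pool.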

        Attachments

        1. YARN-9968.001.patch
          2 kB
          Tarun Parimi

            People

            • Assignee: Tarun Parimi (tarunparimi)
            • Reporter: Tarun Parimi (tarunparimi)
            • Votes: 0
            • Watchers: 5
