Hadoop YARN / YARN-7585

NodeManager should go unhealthy when state store throws DBException

Details

    • Reviewed

    Description

      If work-preserving recovery is enabled, the NM will not start up if the state store fails to initialise. However, if the state store becomes unavailable for any reason after startup, the NM does not go unhealthy.
      Because the state store is unavailable, new containers can no longer be started, so the NM should become unhealthy:

      AMLauncher: Error launching appattempt_1508806289867_268617_000001. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: org.iq80.leveldb.DBException: IO error: /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: Read-only file system
      at o.a.h.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
      at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:721)
      ...
      Caused by: java.io.IOException: org.iq80.leveldb.DBException: IO error: /dsk/app/var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/028269.log: Read-only file system
      at o.a.h.y.s.n.r.NMLeveldbStateStoreService.storeApplication(NMLeveldbStateStoreService.java:374)
      at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainerInternal(ContainerManagerImpl.java:848)
      at o.a.h.y.s.n.cm.ContainerManagerImpl.startContainers(ContainerManagerImpl.java:712)
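
      The behaviour requested above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch and not the real NodeManager API: StateStore, StoreHealth and MonitoredStore are hypothetical names. The idea is that a failed state store write is recorded so that a health check can report the node as unhealthy, while the caller still receives the original exception.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: none of these types are real NodeManager classes.
final class StateStoreHealthSketch {

    /** Stand-in for the recovery state store (assumed interface). */
    interface StateStore {
        void storeApplication(String appId) throws IOException;
    }

    /** Remembers the first store failure so a health check can report it. */
    static final class StoreHealth {
        private final AtomicReference<String> failure = new AtomicReference<>();

        void markFailed(Exception e) {
            // Keep only the first failure; later ones are usually follow-on errors.
            failure.compareAndSet(null, String.valueOf(e.getMessage()));
        }

        boolean isHealthy() {
            return failure.get() == null;
        }

        String report() {
            String f = failure.get();
            return f == null ? "" : "State store unavailable: " + f;
        }
    }

    /** Delegates to a store, flags the node unhealthy on failure, then rethrows. */
    static final class MonitoredStore implements StateStore {
        private final StateStore delegate;
        private final StoreHealth health;

        MonitoredStore(StateStore delegate, StoreHealth health) {
            this.delegate = delegate;
            this.health = health;
        }

        @Override
        public void storeApplication(String appId) throws IOException {
            try {
                delegate.storeApplication(appId);
            } catch (IOException e) {
                health.markFailed(e);
                throw e; // the caller still sees the original error
            }
        }
    }

    public static void main(String[] args) {
        StoreHealth health = new StoreHealth();
        // Simulated store whose backing disk has gone read-only.
        StateStore broken = appId -> {
            throw new IOException("IO error: Read-only file system");
        };
        StateStore store = new MonitoredStore(broken, health);

        System.out.println("healthy before write: " + health.isHealthy());
        try {
            store.storeApplication("application_0001");
        } catch (IOException e) {
            System.out.println("container start rejected: " + e.getMessage());
        }
        System.out.println("healthy after write: " + health.isHealthy());
        System.out.println(health.report());
    }
}
```

      In this sketch the health flag is sticky: once a write fails the node stays unhealthy until it is restarted. Whether the real fix should recover automatically is a design choice the sketch does not address.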
      

      Attachments

        1. YARN-7585.001.patch
          16 kB
          wilfreds#1
        2. YARN-7585.002.patch
          17 kB
          wilfreds#1
        3. YARN-7585.003.patch
          17 kB
          wilfreds#1

          People

            wilfreds Wilfred Spiegelenburg