Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-440

Datanode loops forever if it cannot create directories

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.1
    • Component/s: Ozone Datanode
    • Labels:
    • Target Version/s:

      Description

      Datanode starts but runs in a tight loop forever if it cannot create the DataNode ID directory e.g. due to permissions issues. I encountered this by having a typo in my ozone-site.xml for ozone.scm.datanode.id.

      In just a few minutes the DataNode had generated over 20GB of log+out files with the following exception:

      2018-09-12 17:28:20,649 WARN org.apache.hadoop.util.concurrent.ExecutorHelper: Caught exception in thread Datanode State Machine Thread - 2
      63:
      java.io.IOException: Unable to create datanode ID directories.
      at org.apache.hadoop.ozone.container.common.helpers.ContainerUtils.writeDatanodeDetailsTo(ContainerUtils.java:211)
      at org.apache.hadoop.ozone.container.common.states.datanode.InitDatanodeState.persistContainerDatanodeDetails(InitDatanodeState.java:131)
      at org.apache.hadoop.ozone.container.common.states.datanode.InitDatanodeState.call(InitDatanodeState.java:111)
      at org.apache.hadoop.ozone.container.common.states.datanode.InitDatanodeState.call(InitDatanodeState.java:50)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      2018-09-12 17:28:20,648 WARN org.apache.hadoop.util.concurrent.ExecutorHelper: Execution exception when running task in Datanode State Mach
      ine Thread - 160
      2018-09-12 17:28:20,650 WARN org.apache.hadoop.util.concurrent.ExecutorHelper: Caught exception in thread Datanode State Machine Thread - 1
      60:
      java.io.IOException: Unable to create datanode ID directories.
      at org.apache.hadoop.ozone.container.common.helpers.ContainerUtils.writeDatanodeDetailsTo(ContainerUtils.java:211)
      at org.apache.hadoop.ozone.container.common.states.datanode.InitDatanodeState.persistContainerDatanodeDetails(InitDatanodeState.java:131)
      at org.apache.hadoop.ozone.container.common.states.datanode.InitDatanodeState.call(InitDatanodeState.java:111)
      at org.apache.hadoop.ozone.container.common.states.datanode.InitDatanodeState.call(InitDatanodeState.java:50)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)

      We should just exit since this is a fatal issue.

        Attachments

          Activity

            People

            • Assignee:
              bharat Bharat Viswanadham
              Reporter:
              arp Arpit Agarwal

              Dates

              • Created:
                Updated:
                Resolved:

                Issue deployment