Hadoop HDFS / HDFS-1692

In secure mode, Datanode process doesn't exit when disks fail.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.204.0, 0.23.0
    • Fix Version/s: 0.20.204.0, 0.23.0
    • Component/s: datanode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In secure mode, when more disks fail than the number of tolerated volume failures, the datanode process doesn't exit properly; it just hangs even though the shutdown method is called.
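The condition described above can be sketched as follows. This is a minimal illustration only, not the DataNode code: the class and method names are hypothetical, and volume health is modeled as a plain list of booleans. The point is that shutdown is expected once the number of failed volumes exceeds the tolerated count.

```java
import java.util.List;

// Hypothetical helper, not part of Hadoop: models the "more failures than
// tolerated" condition from the description.
final class VolumeFailureCheck {

    // Counts the volumes reported as unhealthy (false = failed).
    static int countFailed(List<Boolean> volumeHealthy) {
        int failed = 0;
        for (boolean healthy : volumeHealthy) {
            if (!healthy) {
                failed++;
            }
        }
        return failed;
    }

    // True when failures exceed the tolerated count, i.e. the point at which
    // the datanode is expected to shut down and the process to exit.
    static boolean shouldShutDown(List<Boolean> volumeHealthy, int toleratedFailures) {
        return countFailed(volumeHealthy) > toleratedFailures;
    }
}
```

With three volumes, two of them failed, and one tolerated failure, `shouldShutDown` returns true; with only one failed volume it returns false.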

      1. HDFS-1692-v0.23-2.patch (3 kB, Bharath Mundlapudi)
      2. HDFS-1692-v0.23-1.patch (4 kB, Bharath Mundlapudi)
      3. HDFS-1692-1.patch (4 kB, Bharath Mundlapudi)

          Activity

          Bharath Mundlapudi added a comment -

          Attaching the patch.

          Boris Shkolnik added a comment -

          +1

          Boris Shkolnik added a comment -

          committed to branch-0.20-security. Thanks Bharath.

          Eli Collins added a comment -

          @Bharath, are you planning to make the same change on trunk?

          • The change to ipc/Server.java should be covered by a HADOOP jira, right? What does this particular change address?
          • Should the exit value reflect whether the start was successful?

          Bharath Mundlapudi added a comment -

          Yes, I will be porting this one to trunk. We run our clusters in secure mode.

          When the tolerated volume failure threshold is reached, shutdown is called, but the datanode continues to run and doesn't exit. This change addresses the problem only in secure mode; non-secure mode shouldn't have this problem.

          Eli Collins added a comment -

          Why is the change to ipc/Server.java necessary to accomplish this? Please file a Hadoop jira for this change, as this code does not live in HDFS.

          Bharath Mundlapudi added a comment -

          While tracking down this hang, I cleaned up some threads that were not exiting; hence the change to ipc/Server.java. But yes, we can move that particular code to another Jira. For 0.23, we can do it separately.
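As general background on why threads that do not exit can cause a hang: a JVM keeps running as long as any non-daemon thread is alive, unless System.exit is called. The sketch below shows one common way to make a worker thread stoppable. It is an illustration under assumptions, not the patch itself; the class name and the polling loop are hypothetical, and the thread does not say which threads were involved.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical worker, not Hadoop code: a non-daemon thread that can be
// stopped cleanly so it does not keep the JVM alive after shutdown.
final class StoppableWorker implements Runnable {
    private volatile boolean running = true;
    private final Thread thread = new Thread(this, "stoppable-worker");

    void start() {
        thread.start();
    }

    @Override
    public void run() {
        while (running) {
            try {
                // Stand-in for real work or a blocking wait.
                TimeUnit.MILLISECONDS.sleep(50);
            } catch (InterruptedException e) {
                // Interrupted by shutdown(): loop around and re-check the flag.
            }
        }
    }

    // Signals the worker to stop, wakes it if it is sleeping, and waits up to
    // timeoutMillis for it to finish. Returns true if the thread has exited.
    boolean shutdown(long timeoutMillis) {
        running = false;
        thread.interrupt();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
        return !thread.isAlive();
    }
}
```

If `shutdown` returns false, the thread is still running and would keep the process alive, which is the kind of symptom described in this issue.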

          Bharath Mundlapudi added a comment -

          Attaching a patch for version 0.23.

          Suresh Srinivas added a comment -

          Minor comments:

          1. DataNode.java - in the finally block, LOG.warn is more appropriate
          2. DataXceiverServer.java - Catching AsynchronousCloseException may not be necessary
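For context on the second point: in java.nio, closing a ServerSocketChannel while another thread is blocked in accept() makes that accept() call throw AsynchronousCloseException, which is a subclass of ClosedChannelException and therefore of IOException. The sketch below demonstrates that general behavior. It is not the DataXceiverServer code; the class and method names are hypothetical. It catches ClosedChannelException so that it also covers the case where close() wins the race and accept() starts on an already closed channel.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.ServerSocketChannel;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo, not Hadoop code: an accept() call that ends when the
// channel is closed from another thread.
final class AcceptLoopDemo {

    // Returns true if the blocked accept() was terminated by closing the channel.
    static boolean acceptStopsOnClose() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // any free port
            AtomicBoolean stoppedByClose = new AtomicBoolean(false);

            Thread acceptor = new Thread(() -> {
                try {
                    server.accept(); // blocks until a connection arrives or the channel closes
                } catch (ClosedChannelException e) {
                    // Includes AsynchronousCloseException (closed during accept).
                    stoppedByClose.set(true);
                } catch (IOException e) {
                    // Some other I/O failure: leave the flag unset.
                }
            }, "acceptor");

            acceptor.start();
            Thread.sleep(200);   // give the acceptor time to block in accept()
            server.close();      // wakes the acceptor with an exception
            acceptor.join(2000);
            return stoppedByClose.get() && !acceptor.isAlive();
        } catch (IOException | InterruptedException e) {
            return false;
        }
    }
}
```

Because AsynchronousCloseException is an IOException, a handler that already catches IOException also sees it; whether a dedicated catch is useful depends on how the surrounding code wants to log or react.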
          Bharath Mundlapudi added a comment -

          I have cleaned things up a little, mainly the logging and a few comments. Uploading the patch again.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12482713/HDFS-1692-v0.23-2.patch
          against trunk revision 1136230.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/790//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/790//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/790//console

          This message is automatically generated.

          Bharath Mundlapudi added a comment -

          Existing tests like TestDataNodeExit should check for this condition, so I have not added a new test for this.

          Suresh Srinivas added a comment -

          I committed the patch to trunk. Thank you Bharath.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #749 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/749/)
          HDFS-1692. In secure mode, Datanode process doesn't exit when disks fail. Contributed by Bharath Mundlapudi.

          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1136741
          Files :

          • /hadoop/common/trunk/hdfs/CHANGES.txt
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java
          • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java

          Owen O'Malley added a comment -

          Hadoop 0.20.204.0 was released.


            People

            • Assignee: Bharath Mundlapudi
            • Reporter: Bharath Mundlapudi
            • Votes: 0
            • Watchers: 5
