Hadoop HDFS / HDFS-4544

Error in deleting blocks should not do check disk, for all types of errors

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.1, 2.0.3-alpha
    • Fix Version/s: 1.2.0, 0.23.7, 2.1.0-beta
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The following code in DataNode.java

            try {
              if (blockScanner != null) {
                blockScanner.deleteBlocks(toDelete);
              }
              data.invalidate(toDelete);
            } catch(IOException e) {
              checkDiskError();
              throw e;
            }
      

      causes a disk check to run whenever invalidate fails with any error.

      We have seen errors like:

      2013-03-02 00:08:28,849 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_-2973118207682441648_225738165. BlockInfo not found in volumeMap.

      All such errors trigger a disk check, which makes clients time out.
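
The difference between the reported behaviour and the proposed one can be illustrated with a small, self-contained sketch. Everything here is a hypothetical stand-in (the class `InvalidateSketch`, the `"missing"` block-name convention, and the counter are assumptions made for illustration); it is not the actual DataNode code or the committed patch.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the DataNode delete path; not the real Hadoop classes.
class InvalidateSketch {
    /** Counts how often the (expensive) disk check would have been run. */
    static final AtomicInteger diskChecks = new AtomicInteger();

    /** Stand-in for data.invalidate(toDelete): fails on a bookkeeping lookup, not on disk I/O. */
    static void invalidate(String[] blocks) throws IOException {
        for (String block : blocks) {
            if (block.startsWith("missing")) {
                throw new IOException("BlockInfo not found in volumeMap: " + block);
            }
        }
    }

    /** Stand-in for checkDiskError(); here it only records that it was called. */
    static void checkDiskError() {
        diskChecks.incrementAndGet();
    }

    /** Reported behaviour: every IOException from invalidate triggers a disk check. */
    static void deleteWithDiskCheck(String[] blocks) throws IOException {
        try {
            invalidate(blocks);
        } catch (IOException e) {
            checkDiskError();
            throw e;
        }
    }

    /** Proposed behaviour: the exception simply propagates, with no disk check. */
    static void deleteWithoutDiskCheck(String[] blocks) throws IOException {
        invalidate(blocks);
    }

    public static void main(String[] args) {
        String[] toDelete = {"missing_blk_1"};
        try {
            deleteWithDiskCheck(toDelete);
        } catch (IOException e) {
            // Expected: the block is unknown.
        }
        try {
            deleteWithoutDiskCheck(toDelete);
        } catch (IOException e) {
            // Expected: same error, but no disk check was run.
        }
        System.out.println("disk checks triggered: " + diskChecks.get());
    }
}
```

In this sketch the caller still sees the IOException in both variants; only the side effect of scanning the disks differs.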

      1. HDFS-4544.patch
        0.5 kB
        Arpit Agarwal
      2. HDFS-4544.branch-1.1.patch
        0.6 kB
        Arpit Agarwal
      3. HDFS-4544.trunk.1.patch
        0.9 kB
        Arpit Agarwal

          Activity

          Amareshwari Sriramadasu added a comment -

          Looking at the FSDataset.invalidate code, I see that the possible errors are the following:

          1. The BlockInfo is not found in volumeMap
          2. The block is not found in blockMap
          3. There is no volume corresponding to the block
          4. The parent directory does not exist

          Also, the delete itself happens asynchronously, so none of the errors above can be caused by disk errors. I propose we go ahead and remove the checkDiskError() call from the try/catch block above.
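
The reasoning above can be sketched with a toy model: the failure cases are bookkeeping lookups, and the file delete is only queued for later. The class `ToyDataset`, its fields, and the exception message are hypothetical names chosen for illustration, and only two of the four cases are modelled; this is not the FSDataset implementation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the checks listed above; names are hypothetical, not Hadoop's API.
class ToyDataset {
    /** Block id -> volume name; a null value models a block with no volume. */
    final Map<String, String> volumeMap = new HashMap<>();
    /** Deletions handed off for later execution, modelling the asynchronous delete. */
    final List<String> pendingDeletes = new ArrayList<>();

    /** Drops bookkeeping for each block and queues the file delete; no disk I/O happens here. */
    void invalidate(List<String> blocks) throws IOException {
        boolean error = false;
        for (String block : blocks) {
            if (!volumeMap.containsKey(block)) {
                // Case 1: the bookkeeping entry is missing.
                error = true;
                continue;
            }
            String volume = volumeMap.get(block);
            if (volume == null) {
                // Case 3: no volume corresponds to the block.
                error = true;
                continue;
            }
            volumeMap.remove(block);
            // The actual file removal is deferred, so a disk fault cannot surface here.
            pendingDeletes.add(volume + "/" + block);
        }
        if (error) {
            throw new IOException("Error in deleting blocks.");
        }
    }

    public static void main(String[] args) {
        ToyDataset dataset = new ToyDataset();
        dataset.volumeMap.put("blk_1", "vol0");
        dataset.volumeMap.put("blk_2", null);
        try {
            dataset.invalidate(Arrays.asList("blk_1", "blk_2", "blk_3"));
        } catch (IOException e) {
            System.out.println("invalidate failed: " + e.getMessage());
        }
        System.out.println("queued deletes: " + dataset.pendingDeletes);
    }
}
```

Under these assumptions, every exception raised by `invalidate` comes from in-memory state, which is why a disk check in the caller's catch block adds cost without adding information.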

          Suresh Srinivas added a comment -

          +1. This seems like a good change.

          Arpit Agarwal added a comment -

          It looks like no patch is needed for trunk.

          Arpit Agarwal added a comment -

          Reattaching the correct patch.

          Suresh Srinivas added a comment -

          +1 for the patch. It seems like a trivial change to not check for disk error. I am going to commit this change shortly.

          Suresh Srinivas added a comment -

          I think this also applies to trunk. Some of the code is reorganized in trunk. Please see FSDatasetImpl#invalidate().

          Arpit Agarwal added a comment -

          Updated the patches for branch-1 and for trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12572199/HDFS-4544.trunk.1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4042//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4042//console

          This message is automatically generated.

          Arpit Agarwal added a comment -

          A test case is not needed for this patch since we are just removing the checkDiskError call.

          Harsh J added a comment -

          +1, the trunk change looks good.

          Suresh Srinivas added a comment -

          +1 for the trunk and branch-1 patches.

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3423 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3423/)
          HDFS-4544. Error in deleting blocks should not do check disk, for all types of errors. Contributed by Arpit Agarwal. (Revision 1453436)

          Result = SUCCESS
          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453436
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
          Suresh Srinivas added a comment -

          I committed the patch to trunk, branch-2 and branch-1.

          Thank you Arpit. Thank you Amareshwari for diagnosing the issue and creating the bug.

          Kihwal Lee added a comment -

          Committed to branch-0.23.

          Arpit Agarwal added a comment -

          Thanks for reviewing and committing.

          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #148 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/148/)
          HDFS-4544. Error in deleting blocks should not do check disk, for all types of errors. Contributed by Arpit Agarwal. (Revision 1453436)

          Result = SUCCESS
          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453436
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #546 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/546/)
          svn merge -c 1453436 Merging from trunk to branch-0.23 to fix HDFS-4544. (Revision 1453548)

          Result = UNSTABLE
          kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453548
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1337 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1337/)
          HDFS-4544. Error in deleting blocks should not do check disk, for all types of errors. Contributed by Arpit Agarwal. (Revision 1453436)

          Result = SUCCESS
          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453436
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1365 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1365/)
          HDFS-4544. Error in deleting blocks should not do check disk, for all types of errors. Contributed by Arpit Agarwal. (Revision 1453436)

          Result = SUCCESS
          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453436
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
          Matt Foley added a comment -

          Closed upon release of Hadoop 1.2.0.


            People

            • Assignee:
              Arpit Agarwal
            • Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0
            • Watchers:
              12
