Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.3, 3.0.0, 2.0.2-alpha
    • Fix Version/s: 0.23.3, 2.0.2-alpha
    • Component/s: hdfs-client
    • Labels:
      None

      Description

      The deadlock is between DFSOutputStream#close() and DFSClient#close().

        Activity

        Hide
        Kihwal Lee added a comment -

        DFSClient#getLeaseRenewer() doesn't have to be synchronized since LeaseManager.Factory methods are synchronized. Multiple callers are still guaranteed to get a single live renewer back.

        Java stack information for the threads listed above:
        ===================================================
        "Thread-28":
                at
        org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1729)
                - waiting to lock <0xb5a05dc8> (a
        org.apache.hadoop.hdfs.DFSOutputStream)
                at
        org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:674)
                at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:691)
                - locked <0xb5a06ed8> (a org.apache.hadoop.hdfs.DFSClient)
                at
        org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539)
                at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2386)
                - locked <0xb44b00e8> (a org.apache.hadoop.fs.FileSystem$Cache)
                at
        org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2403)
                - locked <0xb44b0100> (a
        org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer)
                at
        org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
        "Thread-1175":
                at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:538)
                - waiting to lock <0xb5a06ed8> (a org.apache.hadoop.hdfs.DFSClient)
                at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:550)
                at
        org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1757)
                - locked <0xb5a05dc8> (a org.apache.hadoop.hdfs.DFSOutputStream)
                at
        org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:66)
                at
        org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:99)
                at
        org.apache.hadoop.hdfs.TestDatanodeDeath$Workload.run(TestDatanodeDeath.java:101)
        
        Show
        Kihwal Lee added a comment - DFSClient#getLeaseRenewer() doesn't have to be synchronized since LeaseManager.Factory methods are synchronized. Multiple callers are still guaranteed to get a single live renewer back. Java stack information for the threads listed above: =================================================== "Thread-28": at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1729) - waiting to lock <0xb5a05dc8> (a org.apache.hadoop.hdfs.DFSOutputStream) at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:674) at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:691) - locked <0xb5a06ed8> (a org.apache.hadoop.hdfs.DFSClient) at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539) at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2386) - locked <0xb44b00e8> (a org.apache.hadoop.fs.FileSystem$Cache) at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2403) - locked <0xb44b0100> (a org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) "Thread-1175": at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:538) - waiting to lock <0xb5a06ed8> (a org.apache.hadoop.hdfs.DFSClient) at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:550) at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1757) - locked <0xb5a05dc8> (a org.apache.hadoop.hdfs.DFSOutputStream) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:66) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:99) at org.apache.hadoop.hdfs.TestDatanodeDeath$Workload.run(TestDatanodeDeath.java:101)
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12542787/hdfs-3861.patch.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.TestHftpDelegationToken
        org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3109//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/3109//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3109//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12542787/hdfs-3861.patch.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestHftpDelegationToken org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3109//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/3109//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3109//console This message is automatically generated.
        Hide
        Kihwal Lee added a comment -
        • The test failures are not related to this patch.
        • No test was added. Existing test case exposed this bug (TestDataNodeDeath).
        • The findbugs warning is not caused by this patch.
        Show
        Kihwal Lee added a comment - The test failures are not related to this patch. No test was added. Existing test case exposed this bug (TestDataNodeDeath). The findbugs warning is not caused by this patch.
        Hide
        Colin Patrick McCabe added a comment -

        Looks good to me.

        Show
        Colin Patrick McCabe added a comment - Looks good to me.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        +1 patch looks good. Thanks for fixing this.

        Show
        Tsz Wo Nicholas Sze added a comment - +1 patch looks good. Thanks for fixing this.
        Hide
        Daryn Sharp added a comment -

        Thanks Kihwal!

        Show
        Daryn Sharp added a comment - Thanks Kihwal!
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #360 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/360/)
        merge -r 1379092:1379093 from trunk. FIXES: HDFS-3861 (Revision 1379097)

        Result = UNSTABLE
        daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379097
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-0.23-Build #360 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/360/ ) merge -r 1379092:1379093 from trunk. FIXES: HDFS-3861 (Revision 1379097) Result = UNSTABLE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379097 Files : /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1151 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1151/)
        HDFS-3861. Deadlock in DFSClient (Kihwal Lee via daryn) (Revision 1379093)

        Result = FAILURE
        daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379093
        Files :

        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #1151 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1151/ ) HDFS-3861 . Deadlock in DFSClient (Kihwal Lee via daryn) (Revision 1379093) Result = FAILURE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379093 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1182 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1182/)
        HDFS-3861. Deadlock in DFSClient (Kihwal Lee via daryn) (Revision 1379093)

        Result = SUCCESS
        daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379093
        Files :

        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #1182 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1182/ ) HDFS-3861 . Deadlock in DFSClient (Kihwal Lee via daryn) (Revision 1379093) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379093 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java

          People

          • Assignee:
            Kihwal Lee
            Reporter:
            Kihwal Lee
          • Votes:
            0 Vote for this issue
            Watchers:
            15 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development