Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-73

DFSOutputStream does not close all the sockets

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: hdfs-client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When DFSOutputStream writes to multiple blocks, it closes only the socket opened for the last block. When it is done with writing to one block it should close the socket.

      I noticed this when I was fixing HADOOP-3067. After fixing HADOOP-3067, there were still a lot of sockets open (but not enough to fail the tests). These sockets were used to write to blocks.

      1. HDFS-73_0.23_1.patch
        3 kB
        Uma Maheswara Rao G
      2. HDFS-73_0.23.patch
        2 kB
        Uma Maheswara Rao G
      3. HADOOP-3071.patch
        3 kB
        Raghu Angadi

        Activity

        Hide
        hudson Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #742 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/742/)
        HDFS-73. DFSOutputStream does not close all the sockets. Contributed by Uma Maheswara Rao G

        eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157232
        Files :

        • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java
        • /hadoop/common/trunk/hdfs/CHANGES.txt
        Show
        hudson Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #742 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/742/ ) HDFS-73 . DFSOutputStream does not close all the sockets. Contributed by Uma Maheswara Rao G eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157232 Files : /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java /hadoop/common/trunk/hdfs/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #833 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/833/)
        HDFS-73. DFSOutputStream does not close all the sockets. Contributed by Uma Maheswara Rao G

        eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157232
        Files :

        • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java
        • /hadoop/common/trunk/hdfs/CHANGES.txt
        Show
        hudson Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #833 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/833/ ) HDFS-73 . DFSOutputStream does not close all the sockets. Contributed by Uma Maheswara Rao G eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157232 Files : /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java /hadoop/common/trunk/hdfs/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #750 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/750/)
        HDFS-73. DFSOutputStream does not close all the sockets. Contributed by Uma Maheswara Rao G

        eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157232
        Files :

        • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java
        • /hadoop/common/trunk/hdfs/CHANGES.txt
        Show
        hudson Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #750 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/750/ ) HDFS-73 . DFSOutputStream does not close all the sockets. Contributed by Uma Maheswara Rao G eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157232 Files : /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java /hadoop/common/trunk/hdfs/CHANGES.txt
        Hide
        eli Eli Collins added a comment -

        I've commited this. Thanks Uma!

        Show
        eli Eli Collins added a comment - I've commited this. Thanks Uma!
        Hide
        eli Eli Collins added a comment -

        +1 latest patch looks good. I ran the tests locally and verified the only failures are the ones due to HDFS-2242.

        Show
        eli Eli Collins added a comment - +1 latest patch looks good. I ran the tests locally and verified the only failures are the ones due to HDFS-2242 .
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12490270/HDFS-73_0.23_1.patch
        against trunk revision 1156977.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
        org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
        org.apache.hadoop.hdfs.server.datanode.TestDataDirs
        org.apache.hadoop.hdfs.server.namenode.TestCheckpoint
        org.apache.hadoop.hdfs.server.namenode.TestGetImageServlet
        org.apache.hadoop.hdfs.server.namenode.TestINodeFile
        org.apache.hadoop.hdfs.server.namenode.TestNNLeaseRecovery
        org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark
        org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings
        org.apache.hadoop.hdfs.TestHDFSServerPorts

        +1 contrib tests. The patch passed contrib unit tests.

        -1 system test framework. The patch failed system test framework compile.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1100//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/1100//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1100//console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12490270/HDFS-73_0.23_1.patch against trunk revision 1156977. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery org.apache.hadoop.hdfs.server.datanode.TestDataDirs org.apache.hadoop.hdfs.server.namenode.TestCheckpoint org.apache.hadoop.hdfs.server.namenode.TestGetImageServlet org.apache.hadoop.hdfs.server.namenode.TestINodeFile org.apache.hadoop.hdfs.server.namenode.TestNNLeaseRecovery org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings org.apache.hadoop.hdfs.TestHDFSServerPorts +1 contrib tests. The patch passed contrib unit tests. -1 system test framework. The patch failed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1100//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/1100//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1100//console This message is automatically generated.
        Hide
        umamaheswararao Uma Maheswara Rao G added a comment -

        Test patch results for this change:

        [exec] +1 @author. The patch does not contain any @author tags.

        [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.

        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        [exec] +1 system test framework. The patch passed system test framework compile.

        Tests also verified.
        TestDFSCli is failing for me, that is not relevant to this change.

        Show
        umamaheswararao Uma Maheswara Rao G added a comment - Test patch results for this change: [exec] +1 @author. The patch does not contain any @author tags. [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] +1 system test framework. The patch passed system test framework compile. Tests also verified. TestDFSCli is failing for me, that is not relevant to this change.
        Hide
        umamaheswararao Uma Maheswara Rao G added a comment -

        Hi Eli,
        Thanks a lot for taking a look!.

        Addressed your comments and updated the patch.

        --Thanks

        Show
        umamaheswararao Uma Maheswara Rao G added a comment - Hi Eli, Thanks a lot for taking a look!. Addressed your comments and updated the patch. --Thanks
        Hide
        eli Eli Collins added a comment -

        Hi Uma,

        Latest patch looks good. Wrt testing how about adding the following assert to all non-final sockets in the file? The append tests will fail this currently and it checks that we don't introduce this bug again.

               try {
        +        assert null == s : "Previous socket unclosed";
                 s = createSocketForPipeline(nodes[0], nodes.length, dfsClient);
        

        Nit: please make the formatting (spacing) in closeStream consistent and generate a new patch that will apply with test-patch.

        Show
        eli Eli Collins added a comment - Hi Uma, Latest patch looks good. Wrt testing how about adding the following assert to all non-final sockets in the file? The append tests will fail this currently and it checks that we don't introduce this bug again. try { + assert null == s : "Previous socket unclosed"; s = createSocketForPipeline(nodes[0], nodes.length, dfsClient); Nit: please make the formatting (spacing) in closeStream consistent and generate a new patch that will apply with test-patch.
        Hide
        umamaheswararao Uma Maheswara Rao G added a comment -

        A patch for 0.23 trunk version.

        Show
        umamaheswararao Uma Maheswara Rao G added a comment - A patch for 0.23 trunk version.
        Hide
        kevin.faro Kevin Faro added a comment -

        Has there been any movement on this issue? Or has a work-around been identified? I am still seeing file-descriptors not being closed properly in 0.20.2-cdh3u0.

        Show
        kevin.faro Kevin Faro added a comment - Has there been any movement on this issue? Or has a work-around been identified? I am still seeing file-descriptors not being closed properly in 0.20.2-cdh3u0.
        Hide
        rangadi Raghu Angadi added a comment -

        Thanks Dhruba. The order does not matter. In my earlier changes, I had moved setting blockStream to null to the end and later forgot to move it back. I can change the order.

        Show
        rangadi Raghu Angadi added a comment - Thanks Dhruba. The order does not matter. In my earlier changes, I had moved setting blockStream to null to the end and later forgot to move it back. I can change the order.
        Hide
        dhruba dhruba borthakur added a comment -

        Code looks good. One question: the original code closes blockStream first and then close blockReplyStream. The new code closes blockReplyStream first. Is this on purpose?

        Show
        dhruba dhruba borthakur added a comment - Code looks good. One question: the original code closes blockStream first and then close blockReplyStream. The new code closes blockReplyStream first. Is this on purpose?
        Hide
        rangadi Raghu Angadi added a comment -

        This does not affect 0.16. Closing s.getOutputStream() also closes the socket (not just the otherway round). o.a.h.SocketOutputStream() should do the same. But still I think patch is good as it is.. it cleans up the code a bit and closes the socket if connection fails inside createBlockOutputStream().

        Show
        rangadi Raghu Angadi added a comment - This does not affect 0.16. Closing s.getOutputStream() also closes the socket (not just the otherway round). o.a.h.SocketOutputStream() should do the same. But still I think patch is good as it is.. it cleans up the code a bit and closes the socket if connection fails inside createBlockOutputStream().
        Hide
        rangadi Raghu Angadi added a comment -

        Attached patch adds closeBlockStreams().

        Without this patch there around 200 sockets to DataNodes in CLOSE_WAIT at the end of TestCrcCorruption unit test. With the patch there are no sockets open to DataNodes.

        Show
        rangadi Raghu Angadi added a comment - Attached patch adds closeBlockStreams() . Without this patch there around 200 sockets to DataNodes in CLOSE_WAIT at the end of TestCrcCorruption unit test. With the patch there are no sockets open to DataNodes.

          People

          • Assignee:
            umamaheswararao Uma Maheswara Rao G
            Reporter:
            rangadi Raghu Angadi
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development