Hadoop HDFS / HDFS-3696

Create files with WebHdfsFileSystem goes OOM when file size is big

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 1.1.0, 0.23.3, 2.0.2-alpha
    • Component/s: webhdfs
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM if the file size is large. When I tested, 20MB files were fine, but 200MB didn't work.

      I also tried reading a large file by issuing "-cat" and piping to a slow sink in order to force buffering. The read path didn't have this problem. The memory consumption stayed the same regardless of progress.

      1. h3696_20120724.patch
        9 kB
        Tsz Wo Nicholas Sze
      2. h3696_20120724_b-1.patch
        9 kB
        Tsz Wo Nicholas Sze
      3. h3696_20120724_0.23.patch
        10 kB
        Tsz Wo Nicholas Sze

          Activity

          Transition                   Time In Source Status   Execution Times   Last Executer         Last Execution Date
          Open -> Patch Available      4d 13h 20m              1                 Tsz Wo Nicholas Sze   25/Jul/12 04:43
          Patch Available -> Resolved  20h 5m                  1                 Tsz Wo Nicholas Sze   26/Jul/12 00:49
          Resolved -> Reopened         5m 17s                  1                 Tsz Wo Nicholas Sze   26/Jul/12 00:54
          Reopened -> Resolved         1d 6h 39m               1                 Tsz Wo Nicholas Sze   27/Jul/12 07:34
          Resolved -> Closed           76d 11h 12m             1                 Arun C Murthy         11/Oct/12 18:46
          Vinayakumar B made changes -
          Component/s webhdfs [ 12319200 ]
          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Suresh Srinivas made changes -
          Fix Version/s 1.1.0 [ 12317959 ]
          Target Version/s 1.1.0 [ 12317959 ]
          Arun C Murthy made changes -
          Fix Version/s 2.0.2-alpha [ 12322472 ]
          Fix Version/s 1.1.0 [ 12317959 ]
          Matt Foley made changes -
          Fix Version/s 1.1.0 [ 12317959 ]
          Fix Version/s 1.1.1 [ 12321656 ]
          Matt Foley added a comment -

          Because 1.1.0 was delayed, this fix was moved from 1.1.1 into 1.1.0.

          Tsz Wo Nicholas Sze made changes -
          Fix Version/s 1.1.1 [ 12321656 ]
          Fix Version/s 1.2.0 [ 12321657 ]
          Tsz Wo Nicholas Sze added a comment -

          Merged to branch-1.1.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #326 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/326/)
          HDFS-3696. Set chunked streaming mode in WebHdfsFileSystem write operations to get around a Java library bug causing OutOfMemoryError. (Revision 1366293)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1366293
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java
          Tsz Wo Nicholas Sze made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Fix Version/s 1.2.0 [ 12321657 ]
          Resolution Fixed [ 1 ]
          Tsz Wo Nicholas Sze added a comment -

          I have committed this to 0.23 and branch-1.

          Tsz Wo Nicholas Sze added a comment -

          Hi Robert, thanks for taking a look.

          Robert Joseph Evans added a comment -

          Thanks for the patch for branch-0.23. +1 (non-binding) for it. I reviewed the change and ran the tests.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1148 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1148/)
          HDFS-3696. Set chunked streaming mode in WebHdfsFileSystem write operations to get around a Java library bug causing OutOfMemoryError. (Revision 1365839)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365839
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1116 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1116/)
          HDFS-3696. Set chunked streaming mode in WebHdfsFileSystem write operations to get around a Java library bug causing OutOfMemoryError. (Revision 1365839)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365839
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #325 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/325/)
          svn merge -c -1365843 for reverting HDFS-3696 since the test cannot be compiled. (Revision 1365846)
          svn merge -c 1365839 from trunk for HDFS-3696. Set chunked streaming mode in WebHdfsFileSystem write operations to get around a Java library bug causing OutOfMemoryError. (Revision 1365843)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365846
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java

          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365843
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
          Tsz Wo Nicholas Sze made changes -
          Attachment h3696_20120724_b-1.patch [ 12537952 ]
          Tsz Wo Nicholas Sze added a comment -

          h3696_20120724_b-1.patch: for branch-1.

          Tsz Wo Nicholas Sze made changes -
          Attachment h3696_20120724_0.23.patch [ 12537950 ]
          Tsz Wo Nicholas Sze added a comment -

          h3696_20120724_0.23.patch: for 0.23.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2543 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2543/)
          HDFS-3696. Set chunked streaming mode in WebHdfsFileSystem write operations to get around a Java library bug causing OutOfMemoryError. (Revision 1365839)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365839
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2523 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2523/)
          HDFS-3696. Set chunked streaming mode in WebHdfsFileSystem write operations to get around a Java library bug causing OutOfMemoryError. (Revision 1365839)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365839
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2587 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2587/)
          HDFS-3696. Set chunked streaming mode in WebHdfsFileSystem write operations to get around a Java library bug causing OutOfMemoryError. (Revision 1365839)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365839
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
          Tsz Wo Nicholas Sze made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Tsz Wo Nicholas Sze added a comment -

          0.23 needs a new patch since the test cannot be compiled.

          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Fix Version/s 3.0.0 [ 12320356 ]
          Fix Version/s 2.2.0-alpha [ 12322472 ]
          Resolution Fixed [ 1 ]
          Tsz Wo Nicholas Sze added a comment -

          Thanks Suresh for reviewing it.

          I have committed this.

          Tsz Wo Nicholas Sze added a comment -

          > Any idea why it performed best at 32KB?

          Since TCP has a max packet size of 64kB, I expected the best chunk size to be somewhere close to 64kB. I was surprised that 32kB was better than 48kB in my experiment. Perhaps it was due to some implementation detail in the Java library.

          Kihwal Lee added a comment -

          > I tried several chunk sizes for writing 300MB files. 32kB was the best in my test.

          Any idea why it performed best at 32KB?

          > Kihwal do you want to run some tests with this patch as well?

          I think it's okay if the memory consumption is under control. All I did was a simple put.

          Suresh Srinivas added a comment -

          Kihwal do you want to run some tests with this patch as well?

          Suresh Srinivas added a comment -

          +1 for the patch.

          Tsz Wo Nicholas Sze made changes -
          Summary "FsShell put using WebHdfsFileSystem goes OOM when file size is big" -> "Create files with WebHdfsFileSystem goes OOM when file size is big"
          Tsz Wo Nicholas Sze added a comment -

          Revised the summary since this was not specific to "fs -put".

          Tsz Wo Nicholas Sze added a comment -

          The patch only changes WebHdfsFileSystem and adds a new test. The failed tests are not related.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12537802/h3696_20120724.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.TestReplication
          org.apache.hadoop.hdfs.TestDatanodeBlockScanner
          org.apache.hadoop.hdfs.TestPersistBlocks

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2901//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2901//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment h3696_20120724.patch [ 12537802 ]
          Tsz Wo Nicholas Sze added a comment -

          h3696_20120724.patch: add setChunkedStreamingMode(32kB).

          I tried several chunk sizes for writing 300MB files. 32kB was the best in my test.

          Chunk size   1st run      2nd run
          4kB          3.95MB/s     3.95MB/s
          16kB         7.81MB/s     7.70MB/s
          24kB         12.58MB/s    12.29MB/s
          32kB         14.15MB/s    14.28MB/s
          48kB         14.25MB/s    13.29MB/s
          64kB         13.65MB/s    13.57MB/s
          128kB        13.94MB/s    13.15MB/s
          1MB          13.11MB/s    13.45MB/s
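
The 48kB row has the single fastest run, so the "32kB was the best" reading is easier to see once the two runs are averaged per chunk size. The sketch below does only that arithmetic on the figures reported in this comment; the class and method names are hypothetical and are not part of the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ChunkSizeAverages {
  /** Mean of the two reported runs per chunk size, in MB/s. */
  static Map<String, Double> means() {
    String[] sizes = {"4kB", "16kB", "24kB", "32kB", "48kB", "64kB", "128kB", "1MB"};
    double[][] runs = {
      {3.95, 3.95}, {7.81, 7.70}, {12.58, 12.29}, {14.15, 14.28},
      {14.25, 13.29}, {13.65, 13.57}, {13.94, 13.15}, {13.11, 13.45}
    };
    Map<String, Double> out = new LinkedHashMap<>();
    for (int i = 0; i < sizes.length; i++) {
      out.put(sizes[i], (runs[i][0] + runs[i][1]) / 2);
    }
    return out;
  }

  /** Returns the chunk size with the highest mean throughput. */
  static String best(Map<String, Double> means) {
    String best = null;
    for (Map.Entry<String, Double> e : means.entrySet()) {
      if (best == null || e.getValue() > means.get(best)) {
        best = e.getKey();
      }
    }
    return best;
  }

  public static void main(String[] args) {
    Map<String, Double> m = means();
    m.forEach((size, mean) -> System.out.printf("%-6s %.3f MB/s%n", size, mean));
    System.out.println("best mean: " + best(m));
  }
}
```

With only two runs per size, the gap between 32kB and its neighbours is small, so this is a summary of the reported numbers rather than a statistically strong result.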
          Tsz Wo Nicholas Sze made changes -
          Assignee Jing Zhao [ jingzhao ] Tsz Wo (Nicholas), SZE [ szetszwo ]
          Tsz Wo Nicholas Sze made changes -
          Assignee Jing Zhao [ jingzhao ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue is part of HDFS-3667 [ HDFS-3667 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue relates to HDFS-3671 [ HDFS-3671 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue is part of HDFS-3667 [ HDFS-3667 ]
          Tsz Wo Nicholas Sze added a comment -

          I also noticed this problem when I tested WebHdfsFileSystem with 3GB files in HDFS-3671. After adding HttpURLConnection.setChunkedStreamingMode(..), the test ran well. Since it is only a one-line change, I will add it with the retry patch (HDFS-3667).
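
For readers unfamiliar with the call, here is a minimal sketch of an HTTP PUT that enables chunked streaming before the body is written. It is not the WebHdfsFileSystem patch itself: the class name, method names, and the 4kB copy buffer are assumptions made for illustration, and only the HttpURLConnection calls are standard JDK API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

final class ChunkedUploadSketch {
  /** 32kB, the chunk size chosen in this issue. */
  static final int CHUNK_SIZE = 32 * 1024;

  /** Opens a PUT connection configured for chunked streaming; it does not connect yet. */
  static HttpURLConnection openForStreamingPut(URL url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    // Must be called before the output stream is obtained. Without a streaming
    // mode, the JDK client may hold the whole request body in memory.
    conn.setChunkedStreamingMode(CHUNK_SIZE);
    return conn;
  }

  /** Copies the input to the server with a small fixed buffer and returns the HTTP status. */
  static int upload(URL url, InputStream in) throws IOException {
    HttpURLConnection conn = openForStreamingPut(url);
    try (OutputStream out = conn.getOutputStream()) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    }
    try {
      return conn.getResponseCode();
    } finally {
      conn.disconnect();
    }
  }
}
```

Chunked mode suits this case because the client does not need to know the total length up front; when the length is known, setFixedLengthStreamingMode is the other standard way to avoid buffering.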

          Kihwal Lee made changes -
          Description
          Original Value:
          When dong "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM if the file size is large. When I tested, 20MB files were fine, but 200MB didn't work.

          I also tried reading a large file by issuing "-cat" and piping to a slow sink in order to force buffering. The read path didn't have this problem. The memory consumption stayed the same regardless of progress.
          New Value:
          When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM if the file size is large. When I tested, 20MB files were fine, but 200MB didn't work.

          I also tried reading a large file by issuing "-cat" and piping to a slow sink in order to force buffering. The read path didn't have this problem. The memory consumption stayed the same regardless of progress.
          Kihwal Lee added a comment -

          > The map heap

          The max heap

          For reading a 1G file piped into a sink that consumes data at 10 KB/s, VSZ stayed at 547420 KB and RSZ at 87540 KB. It doesn't go OOM, but the VM size seems rather big.

          Kihwal Lee added a comment -

          The following stack trace is from doing copyFromLocal with a 140MB file. The map heap is 1G (-Xmx1000m) on the client side.

          $ hadoop fs -copyFromLocal /tmp/xxx140m webhdfs://my.server.blah:50070/user/kihwal/xxx
          Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
                  at java.util.Arrays.copyOf(Arrays.java:2786)
                  at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
                  at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:61)
                  at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
                  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
                  at java.io.DataOutputStream.write(DataOutputStream.java:90)
                  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
                  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
                  at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
                  at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:240)
                  at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:219)
                  at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:165)
                  at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:150)
                  at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
                  at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
                  at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:145)
                  at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
                  at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
                  at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:122)
                  at org.apache.hadoop.fs.shell.CopyCommands$Put.processArguments(CopyCommands.java:204)
                  at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
                  at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
                  at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
                  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
                  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
                  at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
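
The top frames show every write ending in ByteArrayOutputStream (reached through sun.net.www.http.PosterOutputStream), with Arrays.copyOf growing the backing array. The sketch below illustrates why that pattern scales heap use with file size while a pass-through sink does not. CountingSink and writeBytes are hypothetical stand-ins, not Hadoop or JDK networking code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class BufferingVsStreaming {
  /** Stand-in for a socket: forwards nothing, keeps nothing, only counts bytes. */
  static final class CountingSink extends OutputStream {
    long count;

    @Override
    public void write(int b) {
      count++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
      count += len;
    }
  }

  /** Writes {@code total} bytes in pieces of at most 4kB, as a copy loop would. */
  static void writeBytes(OutputStream out, long total) throws IOException {
    byte[] buf = new byte[4096];
    long left = total;
    while (left > 0) {
      int n = (int) Math.min(left, buf.length);
      out.write(buf, 0, n);
      left -= n;
    }
  }

  public static void main(String[] args) throws IOException {
    long total = 8L * 1024 * 1024;

    // Every byte written is retained on the heap until the stream is released.
    ByteArrayOutputStream buffered = new ByteArrayOutputStream();
    writeBytes(buffered, total);
    System.out.println("bytes retained by in-memory buffer: " + buffered.size());

    // Only the 4kB copy buffer is live, regardless of the total size.
    CountingSink streamed = new CountingSink();
    writeBytes(streamed, total);
    System.out.println("bytes passed through streaming sink: " + streamed.count);
  }
}
```

Because the backing array is reallocated and copied as it grows, peak heap demand during growth can exceed the payload size, which would be consistent with a 140MB upload failing under a 1G heap.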
          
          Kihwal Lee created issue -

            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Kihwal Lee
            • Votes:
              0
              Watchers:
              11
