Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-596

Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory for mOwner and mGroup

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.20.2, 0.21.0
    • Component/s: fuse-dfs
    • Labels:
      None
    • Environment:

      Linux hadoop-001 2.6.28-14-server #47-Ubuntu SMP Sat Jul 25 01:18:34 UTC 2009 i686 GNU/Linux. Namenode with 1GB memory.

    • Hadoop Flags:
      Reviewed
    • Release Note:
      Memory leak in function hdfsFreeFileInfo in libhdfs. This bug affects fuse-dfs severely.
    • Tags:
      memory leak libhdfs hdfsFreeFileInfo fuse-dfs

      Description

      This bugs affects fuse-dfs severely. In my test, about 1GB memory were exhausted and the fuse-dfs mount directory was disconnected after writing 14000 files. This bug is related to the memory leak problem of this issue: http://issues.apache.org/jira/browse/HDFS-420.

      The bug can be fixed very easily. In function hdfsFreeFileInfo() in file hdfs.c (under c++/libhdfs/) change code block:

      //Free the mName
      int i;
      for (i=0; i < numEntries; ++i) {
      if (hdfsFileInfo[i].mName)

      { free(hdfsFileInfo[i].mName); }
      }

      into:

      // free mName, mOwner and mGroup
      int i;
      for (i=0; i < numEntries; ++i) {
      if (hdfsFileInfo[i].mName) { free(hdfsFileInfo[i].mName); }

      if (hdfsFileInfo[i].mOwner)

      { free(hdfsFileInfo[i].mOwner); }

      if (hdfsFileInfo[i].mGroup)

      { free(hdfsFileInfo[i].mGroup); }

      }

      I am new to Jira and haven't figured out a way to generate .patch file yet. Could anyone help me do that so that others can commit the changes into the code base. Thanks!

      1. HDFS-596.patch
        0.7 kB
        Zhang Bingjun

        Issue Links

          Activity

          Hide
          Konstantin Boudnik added a comment -

          I'm sure you're familiar with this http://wiki.apache.org/hadoop/HowToContribute

          Show
          Konstantin Boudnik added a comment - I'm sure you're familiar with this http://wiki.apache.org/hadoop/HowToContribute
          Hide
          Zhang Bingjun added a comment -

          Please help review the patch and commit it if you think everything is ok. Thanks!

          Show
          Zhang Bingjun added a comment - Please help review the patch and commit it if you think everything is ok. Thanks!
          Hide
          Zhang Bingjun added a comment -

          Thanks to Boudnik!

          Show
          Zhang Bingjun added a comment - Thanks to Boudnik!
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12418696/HDFS-596.patch
          against trunk revision 811493.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/12/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12418696/HDFS-596.patch against trunk revision 811493. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/12/console This message is automatically generated.
          Hide
          Christian Kunz added a comment -

          We just hit this bug big time. Our applications ran out of memory.
          We will have to apply this patch ourselves.
          Why was this issue not declared as blocker?
          This is a <b>serious</b> memory issue introduced between hadoop-0.18 and hadoop-0.20.

          Show
          Christian Kunz added a comment - We just hit this bug big time. Our applications ran out of memory. We will have to apply this patch ourselves. Why was this issue not declared as blocker? This is a <b>serious</b> memory issue introduced between hadoop-0.18 and hadoop-0.20.
          Hide
          Eli Collins added a comment -

          Hey Christian,

          Thanks for bumping the priority, and sorry this bit you. I filed HDFS-773 to add memory leaks to the existing libhdfs unit testing.

          The uploaded patch looks good me. I confirmed that it applies to 20.1 and the unit tests pass. I also tested that it fixes the leaks in hdfsListDirectory by writing a program that calls it (and hdfsFreeFileInfo) in a tight loop on a directory with a 1000 files; after applying the patch memory usage no longer grows unbounded. I've checked it into the next CDH release.

          I also confirmed that the patch applies, builds, and hdfs_test runs on trunk (though it fails, that's another jira).

          Let's check this patch into 20.2 and trunk.

          Thanks,
          Eli

          Show
          Eli Collins added a comment - Hey Christian, Thanks for bumping the priority, and sorry this bit you. I filed HDFS-773 to add memory leaks to the existing libhdfs unit testing. The uploaded patch looks good me. I confirmed that it applies to 20.1 and the unit tests pass. I also tested that it fixes the leaks in hdfsListDirectory by writing a program that calls it (and hdfsFreeFileInfo) in a tight loop on a directory with a 1000 files; after applying the patch memory usage no longer grows unbounded. I've checked it into the next CDH release. I also confirmed that the patch applies, builds, and hdfs_test runs on trunk (though it fails, that's another jira). Let's check this patch into 20.2 and trunk. Thanks, Eli
          Hide
          dhruba borthakur added a comment -

          Triggering HadoopQA tests (mostly a formality for this one).

          +1, code looks good. I am going to check this into 0.20, 0.21 and trunk in another day.

          @Eli: is it possible to check in your manual test as a unit tets for libhdfs... it might be useful to catch memory leaks in future.

          Show
          dhruba borthakur added a comment - Triggering HadoopQA tests (mostly a formality for this one). +1, code looks good. I am going to check this into 0.20, 0.21 and trunk in another day. @Eli: is it possible to check in your manual test as a unit tets for libhdfs... it might be useful to catch memory leaks in future.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12418696/HDFS-596.patch
          against trunk revision 835958.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12418696/HDFS-596.patch against trunk revision 835958. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/113/console This message is automatically generated.
          Hide
          Eli Collins added a comment -

          @Dhruba I filed HDFS-773 for testing memory leaks in general, the program I wrote just calls hdfsListDirectory and hdfsFreeFileInfo in a loop and I watched memory consumption using top, it's not a unit test.

          Show
          Eli Collins added a comment - @Dhruba I filed HDFS-773 for testing memory leaks in general, the program I wrote just calls hdfsListDirectory and hdfsFreeFileInfo in a loop and I watched memory consumption using top, it's not a unit test.
          Hide
          Eli Collins added a comment -

          Here's the program in case you want to re-test.

          #include "hdfs.h" 
          
          int main(void) {
            hdfsFS fs;
            hdfsFileInfo *infos;
            int numInfos;
          
            fs = hdfsConnect("localhost", 8020);
            if(!fs) {
              perror("hdfsConnect");
              return -1;
            } 
          
            while (1) {
              if ((infos = hdfsListDirectory(fs, "/", &numInfos)) != NULL) {
                hdfsFreeFileInfo(infos, numInfos);
              } else {
                if (errno) {
                  perror("hdfsListDirectory");
                  return -1;
                }
              }
            }
          
            hdfsDisconnect(fs);
            return 0;
          }
          
          Show
          Eli Collins added a comment - Here's the program in case you want to re-test. #include "hdfs.h" int main(void) { hdfsFS fs; hdfsFileInfo *infos; int numInfos; fs = hdfsConnect( "localhost" , 8020); if (!fs) { perror( "hdfsConnect" ); return -1; } while (1) { if ((infos = hdfsListDirectory(fs, "/" , &numInfos)) != NULL) { hdfsFreeFileInfo(infos, numInfos); } else { if (errno) { perror( "hdfsListDirectory" ); return -1; } } } hdfsDisconnect(fs); return 0; }
          Hide
          dhruba borthakur added a comment -

          I just committed this. Thanks Eli & Zhang.

          Show
          dhruba borthakur added a comment - I just committed this. Thanks Eli & Zhang.
          Hide
          dhruba borthakur added a comment -

          I just committed this. Thanks Eli ad Zhang.

          Show
          dhruba borthakur added a comment - I just committed this. Thanks Eli ad Zhang.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #111 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/111/)
          . Fix memory leak in hdfsFreeFileInfo() for libhdfs.
          (Zhang Bingjun via dhruba)

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #111 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/111/ ) . Fix memory leak in hdfsFreeFileInfo() for libhdfs. (Zhang Bingjun via dhruba)
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #143 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/143/)
          . Fix memory leak in hdfsFreeFileInfo() for libhdfs.
          (Zhang Bingjun via dhruba)

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #143 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/143/ ) . Fix memory leak in hdfsFreeFileInfo() for libhdfs. (Zhang Bingjun via dhruba)
          Hide
          Hudson added a comment -

          Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #76 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/76/)
          . Fix memory leak in hdfsFreeFileInfo() for libhdfs.
          (Zhang Bingjun via dhruba)

          Show
          Hudson added a comment - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #76 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/76/ ) . Fix memory leak in hdfsFreeFileInfo() for libhdfs. (Zhang Bingjun via dhruba)
          Hide
          Hudson added a comment -

          Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #115 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/115/)

          Show
          Hudson added a comment - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #115 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/115/ )

            People

            • Assignee:
              Zhang Bingjun
              Reporter:
              Zhang Bingjun
            • Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 0.5h
                0.5h
                Remaining:
                Remaining Estimate - 0.5h
                0.5h
                Logged:
                Time Spent - Not Specified
                Not Specified

                  Development