Hadoop Common
  1. Hadoop Common
  2. HADOOP-6833

IPC leaks call parameters when exceptions thrown

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.2, 0.21.0, 0.22.0
    • Fix Version/s: 0.20.3, 0.20.205.0, 0.21.1, 0.22.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      HADOOP-6498 moved the calls.remove() call lower into the SUCCESS clause of receiveResponse(), but didn't put a similar calls.remove into the ERROR clause. So, any RPC call that throws an exception ends up orphaning the Call object in the connection's "calls" hashtable. This prevents cleanup of the connection and is a memory leak for the call parameters.

      1. hadoop-6833-20.patch
        0.6 kB
        Eli Collins
      2. hadoop-6833.txt
        0.6 kB
        Todd Lipcon
      3. hadoop-6833.txt
        3 kB
        Todd Lipcon

        Issue Links

          Activity

          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12447656/hadoop-6833.txt
          against trunk revision 956710.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          -1 javac. The applied patch generated 1017 javac compiler warnings (more than the trunk's current 1013 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447656/hadoop-6833.txt against trunk revision 956710. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. -1 javac. The applied patch generated 1017 javac compiler warnings (more than the trunk's current 1013 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/590/console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          I think the "new warnings" are because I used UTF8 in the test, and UTF8 is deprecated. I used it to follow the pattern elsewhere in the test. Let me know if I should change the test case to use something like Text or LongWritable instead.

          Show
          Todd Lipcon added a comment - I think the "new warnings" are because I used UTF8 in the test, and UTF8 is deprecated. I used it to follow the pattern elsewhere in the test. Let me know if I should change the test case to use something like Text or LongWritable instead.
          Hide
          Eli Collins added a comment -

          Hey Todd,

          Looks like the test added in this patch is racy, it failed for me after I looped it for a couple minutes. When System.gc returns it just means the gc has made a best effort to run, not that it actually has, so in the cases where the gc hasn't run the test will fail because myArg was not reaped.

          The change itself looks correct. Since this is a one-liner and memory leaks like this are hard to test I'd be OK with a new patch that just fixes the bug as long as it passes the existing RPC tests. Any other suggestions on how to add test coverage of course are welcome, doesn't seem worth eg adding a getter that returns the number of calls.

          Thanks,
          Eli

          Show
          Eli Collins added a comment - Hey Todd, Looks like the test added in this patch is racy, it failed for me after I looped it for a couple minutes. When System.gc returns it just means the gc has made a best effort to run, not that it actually has, so in the cases where the gc hasn't run the test will fail because myArg was not reaped. The change itself looks correct. Since this is a one-liner and memory leaks like this are hard to test I'd be OK with a new patch that just fixes the bug as long as it passes the existing RPC tests. Any other suggestions on how to add test coverage of course are welcome, doesn't seem worth eg adding a getter that returns the number of calls. Thanks, Eli
          Hide
          Todd Lipcon added a comment -

          Here's the same patch minus the unit test.

          As for manual/cluster testing, we've been running with this fix in HBase's copy of RPC for a couple months. We used to eventually run out of RAM, heap analysis showed this was the problem. This fix got rid of the issue.

          Show
          Todd Lipcon added a comment - Here's the same patch minus the unit test. As for manual/cluster testing, we've been running with this fix in HBase's copy of RPC for a couple months. We used to eventually run out of RAM, heap analysis showed this was the problem. This fix got rid of the issue.
          Hide
          Eli Collins added a comment -

          +1

          Here are the test-patch results. Not sure why it -1 javadoc, ant javadoc|javadoc-dev are clean.

               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no new tests are needed for this patch.
               [exec]                         Also please list what manual steps were performed to verify this patch.
               [exec] 
               [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          
          Show
          Eli Collins added a comment - +1 Here are the test-patch results. Not sure why it -1 javadoc, ant javadoc|javadoc-dev are clean. [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          Hide
          Eli Collins added a comment -

          Attaching patch for branch-0.20.

          Show
          Eli Collins added a comment - Attaching patch for branch-0.20.
          Hide
          Eli Collins added a comment -

          I committed this to 20, 21 and trunk. Thanks Todd!

          Show
          Eli Collins added a comment - I committed this to 20, 21 and trunk. Thanks Todd!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #366 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk-Commit/366/)
          HADOOP-6833. IPC leaks call parameters when exceptions thrown. Contributed by Todd Lipcon.

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #366 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk-Commit/366/ ) HADOOP-6833 . IPC leaks call parameters when exceptions thrown. Contributed by Todd Lipcon.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #435 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk/435/)
          HADOOP-6833. IPC leaks call parameters when exceptions thrown. Contributed by Todd Lipcon.

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #435 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk/435/ ) HADOOP-6833 . IPC leaks call parameters when exceptions thrown. Contributed by Todd Lipcon.
          Hide
          Owen O'Malley added a comment -

          Let's fix this in 205 too.

          Show
          Owen O'Malley added a comment - Let's fix this in 205 too.
          Hide
          Luke Lu added a comment -

          Thanks for the +1 for branch-0.20-security, Owen, I'll cherry-pick the change and run some test.

          Show
          Luke Lu added a comment - Thanks for the +1 for branch-0.20-security, Owen, I'll cherry-pick the change and run some test.
          Hide
          Luke Lu added a comment -

          Committed in branch-0.20-security, where 205 will branch from (old 205 was just deleted).

          Show
          Luke Lu added a comment - Committed in branch-0.20-security, where 205 will branch from (old 205 was just deleted).
          Hide
          Matt Foley added a comment -

          Closed upon release of 0.20.205.0

          Show
          Matt Foley added a comment - Closed upon release of 0.20.205.0

            People

            • Assignee:
              Todd Lipcon
              Reporter:
              Todd Lipcon
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development