Hadoop Common / HADOOP-6577

IPC server response buffer reset threshold should be configurable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Incompatible change, Reviewed
    • Release Note:
      Add a hidden configuration option, "ipc.server.max.response.size", to change the size (default 1 MB) above which a large IPC handler response buffer is reset.

      Description

      Since HDFS-6460, the response buffer in o.a.h.ipc.Server.Handler is reset when it grows beyond a maximum size of 1 MB. This frees the heap that large responses would otherwise keep occupied. This maximum response size should be configurable. Details are in the subsequent comment.
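
A minimal sketch of the reset behavior described above, assuming a per-handler buffer that is reused across calls. The class name, the initial buffer size, and the reset counter are hypothetical and only illustrate the idea; this is not the code from the patch.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative stand-in for a handler's reusable response buffer (hypothetical class). */
class ResponseBufferSketch {
    // Assumed capacity of a fresh buffer; the value is illustrative only.
    static final int INITIAL_RESP_BUF_SIZE = 10 * 1024;

    private final int maxRespSize; // reset threshold in bytes
    private ByteArrayOutputStream buf = new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
    private int resets = 0;

    ResponseBufferSketch(int maxRespSize) {
        this.maxRespSize = maxRespSize;
    }

    /** Writes one response into the shared buffer and returns a copy of its bytes. */
    byte[] respond(byte[] payload) {
        buf.reset(); // discards old contents but keeps the grown backing array
        buf.write(payload, 0, payload.length);
        byte[] response = buf.toByteArray();
        if (buf.size() > maxRespSize) {
            // The buffer grew past the threshold: replace it so the large
            // backing array becomes garbage instead of staying on the heap.
            buf = new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
            resets++;
        }
        return response;
    }

    /** Number of times the buffer was replaced (for illustration only). */
    int resetCount() {
        return resets;
    }
}
```

With a 1 MB threshold, every 3 MB response in this sketch discards and reallocates the buffer; with a 10 MB threshold the same buffer is kept and reused, at the cost of holding the larger array on the heap.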

      1. hadoop-6577.patch
        2 kB
        Suresh Srinivas
      2. hadoop-6577.3.patch
        4 kB
        Suresh Srinivas
      3. hadoop-6577.2.rel20.patch
        3 kB
        Suresh Srinivas
      4. hadoop-6577.2.patch
        4 kB
        Suresh Srinivas
      5. hadoop-6577.2.patch
        4 kB
        Suresh Srinivas
      6. hadoop-6577.1.patch
        4 kB
        Suresh Srinivas

          Activity

          Suresh Srinivas added a comment -

          When the namenode receives a high frequency of requests, each resulting in a response larger than 1 MB, a lot of garbage is created on the heap. This can fill the tenured heap very quickly and trigger a full GC. Full GC results in long stop-the-world pauses, affecting the applications using HDFS.

          In one instance observed on a production cluster, an application repeatedly made listStatus calls, each with a response size of 3 to 5 MB. This resulted in full GC, which could have been avoided by setting the max response size to 10 MB.

          A more permanent solution for this problem is to ensure that an operation that results in a large response (listStatus) is broken into multiple smaller operations. This will be addressed in a separate JIRA. In the interim, I propose adding a hidden config parameter that can be used to set the max response buffer size.

          Suresh Srinivas added a comment -

          The patch introduces a new parameter, "ipc.server.max.response.size", for configuring the max response buffer size.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12436280/hadoop-6577.patch
          against trunk revision 911646.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/console

          This message is automatically generated.

          Arun C Murthy added a comment -

          Minor nit, please add a variable to track the config knob:

          static final String HADOOP_RPC_RESPONSE_SIZE = "ipc.server.max.response.size";
          

          or some such.
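
A minimal sketch of how such a constant might be used to read the setting with a default. It uses java.util.Properties as a stand-in for Hadoop's configuration object so that it runs on its own. The class name, the HADOOP_RPC_RESPONSE_SIZE_DEFAULT constant, and the assumption that the value is given in bytes are all hypothetical.

```java
import java.util.Properties;

/** Hypothetical holder for the config key; not taken from the patch. */
class IpcConfigKeysSketch {
    /** Key from the release note, tracked in one place as the review suggests. */
    static final String HADOOP_RPC_RESPONSE_SIZE = "ipc.server.max.response.size";

    /** Assumed default of 1 MB, expressed in bytes (hypothetical constant name). */
    static final int HADOOP_RPC_RESPONSE_SIZE_DEFAULT = 1024 * 1024;

    /** Returns the configured threshold in bytes, or the default when the key is unset. */
    static int maxResponseSize(Properties conf) {
        String value = conf.getProperty(HADOOP_RPC_RESPONSE_SIZE);
        return value == null
                ? HADOOP_RPC_RESPONSE_SIZE_DEFAULT
                : Integer.parseInt(value.trim());
    }
}
```

Keeping the key in a single constant means the server code and its tests refer to the same string, so a typo in the key name cannot silently fall back to the default in one place only.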

          Suresh Srinivas added a comment -

          New patch incorporating comments from Arun.

          Suresh Srinivas added a comment -

          Attaching patch for 0.20 branch.

          Suresh Srinivas added a comment -

          Attaching the patch again so that Hudson picks it up instead of the branch 0.20 version.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12436400/hadoop-6577.2.patch
          against trunk revision 911748.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/16/console

          This message is automatically generated.

          Arun C Murthy added a comment -

          +1

          Suresh Srinivas added a comment -

          Attaching a new patch that applies with the latest trunk changes.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12436407/hadoop-6577.3.patch
          against trunk revision 911748.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/17/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/17/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/17/console

          This message is automatically generated.

          Suresh Srinivas added a comment -

          I committed the patch.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #178 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/178/)
          Add hidden configuration option "ipc.server.max.response.size" to change the default 1 MB, the maximum size when large IPC handler response buffer is reset. Contributed by Suresh Srinivas.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #255 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/255/)
          Add hidden configuration option "ipc.server.max.response.size" to change the default 1 MB, the maximum size when large IPC handler response buffer is reset. Contributed by Suresh Srinivas.

          dhruba borthakur added a comment -

          In your setup, are you going to be setting ipc.server.max.response.size to 10MB?

          Koji Noguchi added a comment -

          "More permanent solution for this problem is to ensure an operation that results in large response (listStatus) are broken in to multi-step smaller operations."

          Hairong created HDFS-985.

          Suresh Srinivas added a comment -

          @dhruba - "In your setup, are you going to be setting ipc.server.max.response.size to 10MB?"
          Depending on the production cluster's response sizes and patterns, we may have to set the response buffer size to 10 MB. With 100 handlers, this is up to a 1 GB hit on the heap.
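
The heap estimate above can be checked with a short calculation. This is a sketch that assumes every handler ends up retaining a buffer as large as the threshold; the class and method names are hypothetical.

```java
/** Hypothetical helper for estimating heap retained by handler response buffers. */
class HandlerHeapEstimate {
    /**
     * Worst case: each handler keeps one buffer that has grown to the
     * threshold, since buffers at or below the threshold are not reset.
     */
    static long worstCaseRetainedBytes(int handlers, long maxRespSizeBytes) {
        return handlers * maxRespSizeBytes;
    }
}
```

For 100 handlers and a 10 MB threshold this gives 100 × 10 MB = 1000 MB, or roughly 1 GB, compared with about 100 MB at the default 1 MB threshold.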


            People

            • Assignee: Suresh Srinivas
            • Reporter: Suresh Srinivas
            • Votes: 0
            • Watchers: 5
