Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-10525

Allow the client to use a different thread for writing to ease interrupt

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.99.0
    • 0.99.0, hbase-10070
    • Client
    • None
    • Reviewed
    • Hide
      If hbase.ipc.client.allowsInterrupt is set to true (default being false), the writes are performed in a different thread. This workarounds a Java limitation with interruptions and i/o; and limits the impact of interrupting a client call. It's strongly recommended to activate this parameter when using tables with multiple replicas.
      Show
      If hbase.ipc.client.allowsInterrupt is set to true (default being false), the writes are performed in a different thread. This workarounds a Java limitation with interruptions and i/o; and limits the impact of interrupting a client call. It's strongly recommended to activate this parameter when using tables with multiple replicas.

    Description

      This is an issue in the HBASE-10070 context, but as well more generally if you want to interrupt an operation with a limited cost.

      I will attach a doc with a more detailed explanation.

      This adds a thread per region server; so it's otional. The first patch activates it by default to see how it behaves on a full hadoop-qa run. The target is to be unset by default.

      Attachments

        1. 10525.v1.patch
          9 kB
          Nicolas Liochon
        2. HBaseclient-EventualConsistency.pdf
          160 kB
          Nicolas Liochon
        3. 10525.v2.patch
          17 kB
          Nicolas Liochon
        4. 10525.v3.patch
          17 kB
          Nicolas Liochon
        5. 10525.v4.patch
          29 kB
          Nicolas Liochon
        6. 10525.v5.patch
          28 kB
          Nicolas Liochon
        7. 10525.v6.patch
          37 kB
          Nicolas Liochon
        8. 10525.v7.patch
          40 kB
          Nicolas Liochon

        Issue Links

          Activity

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12628768/10525.v1.patch
            against trunk revision .
            ATTACHMENT ID: 12628768

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no new tests are needed for this patch.
            Also please list what manual steps were performed to verify this patch.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            -1 site. The patch appears to cause mvn site goal to fail.

            -1 core tests. The patch failed these unit tests:
            org.apache.hadoop.hbase.ipc.TestIPC

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12628768/10525.v1.patch against trunk revision . ATTACHMENT ID: 12628768 +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 site . The patch appears to cause mvn site goal to fail. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.ipc.TestIPC Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8682//console This message is automatically generated.

            TestIPC uses a rpcTimeout of zero. Patch updated.

            nkeywal Nicolas Liochon added a comment - TestIPC uses a rpcTimeout of zero. Patch updated.
            stack Michael Stack added a comment -

            DEATH_PILL smile

            Something wrong here N:

            + * We can either write the call directly on the socket, either delegate this to
            + * a different threads. Ultimately, this will allow to have a set of writer & reader
            + * for the whole cluster.
            + * Using a different thread allows the client thread to be interrupted w/o any impact. If
            + * the client threads writes on the sockets, an interruption will close this socket.

            I dont follow the above exactly? On the face of it, yet another thread in client seems crazy! But your reasoning above seems good

            ThreadCallSender should be CallRunner or Call Executor? Or CallNanny Can it be a static class?

            Imlement Closeable since it has a close? Not necessary at all but...

            No spaces in here: + setName(name + " - writer");

            What will 'name' be?

            + callsToWrite.remove(cts); Do you have to cancel the future itself too?

            What happens to outstanding 'calls' when cancel called? How they stopped?

            Can you write up a big class comment on the mechanism you are instituting here so it is clear what is going on both for reviewers and for the folks who come along afterward trying to make sense of it all. Good on you nkeywal

            stack Michael Stack added a comment - DEATH_PILL smile Something wrong here N: + * We can either write the call directly on the socket, either delegate this to + * a different threads. Ultimately, this will allow to have a set of writer & reader + * for the whole cluster. + * Using a different thread allows the client thread to be interrupted w/o any impact. If + * the client threads writes on the sockets, an interruption will close this socket. I dont follow the above exactly? On the face of it, yet another thread in client seems crazy! But your reasoning above seems good ThreadCallSender should be CallRunner or Call Executor? Or CallNanny Can it be a static class? Imlement Closeable since it has a close? Not necessary at all but... No spaces in here: + setName(name + " - writer"); What will 'name' be? + callsToWrite.remove(cts); Do you have to cancel the future itself too? What happens to outstanding 'calls' when cancel called? How they stopped? Can you write up a big class comment on the mechanism you are instituting here so it is clear what is going on both for reviewers and for the folks who come along afterward trying to make sense of it all. Good on you nkeywal

            I dont follow the above exactly?

            It's basically a Java issue. Interrupting a process that is doing some i/o is complex, and difficult to do in the same way on all platforms. So the JVM guys went for the solution that was implementable on all platforms: close the socket when there is an interrupt.

            What will 'name' be?

            The same name as the reader, postfixed with 'writer').

            yet another thread in client seems crazy

            Yeah. But I think that we're not far from being able to put a single pool for all the region servers, instead of 1 thread per region server as today (and with this new option on). There is this nasty rpcTimeout, and then we're clean enough to do the change I think. It's something I would like to do with a direct buffer write. Ir's more or less mandatory to scale on large sized cluster imho.

            ThreadCallSender should be CallRunner or Call Executor

            I don't know. It does not execute much, it just writes on the socket. It uses the output stream of the connection, so it has to be an instance if I want to limit the impact on the existing code.

            + callsToWrite.remove(cts); Do you have to cancel the future itself too?

            setting call.done allows the reading thread to skip this part
            removing it from the callsToWrite list allows us to not send the call to the remote server at all.
            So we do both.
            Now if the call was sent to the server, it's not stopped (at least with this patch , just that we will skip the answer. Canceling on the server is interesting, but on the other hand it's another rpc call.

            Imlement Closeable since it has a close? Not necessary at all but...

            agreed. Will do.

            Can you write up a big class comment on the mechanism you are instituting here so it is clear what is going on both for reviewers and for the folks who come along afterward trying to make sense of it all

            Sure. I'm polishing a doc as well that I will put in this jira.

            nkeywal Nicolas Liochon added a comment - I dont follow the above exactly? It's basically a Java issue. Interrupting a process that is doing some i/o is complex, and difficult to do in the same way on all platforms. So the JVM guys went for the solution that was implementable on all platforms: close the socket when there is an interrupt. What will 'name' be? The same name as the reader, postfixed with 'writer'). yet another thread in client seems crazy Yeah. But I think that we're not far from being able to put a single pool for all the region servers, instead of 1 thread per region server as today (and with this new option on). There is this nasty rpcTimeout, and then we're clean enough to do the change I think. It's something I would like to do with a direct buffer write. Ir's more or less mandatory to scale on large sized cluster imho. ThreadCallSender should be CallRunner or Call Executor I don't know. It does not execute much, it just writes on the socket. It uses the output stream of the connection, so it has to be an instance if I want to limit the impact on the existing code. + callsToWrite.remove(cts); Do you have to cancel the future itself too? setting call.done allows the reading thread to skip this part removing it from the callsToWrite list allows us to not send the call to the remote server at all. So we do both. Now if the call was sent to the server, it's not stopped (at least with this patch , just that we will skip the answer. Canceling on the server is interesting, but on the other hand it's another rpc call. Imlement Closeable since it has a close? Not necessary at all but... agreed. Will do. Can you write up a big class comment on the mechanism you are instituting here so it is clear what is going on both for reviewers and for the folks who come along afterward trying to make sense of it all Sure. I'm polishing a doc as well that I will put in this jira.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12628775/10525.v2.patch
            against trunk revision .
            ATTACHMENT ID: 12628775

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no new tests are needed for this patch.
            Also please list what manual steps were performed to verify this patch.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            -1 site. The patch appears to cause mvn site goal to fail.

            -1 core tests. The patch failed these unit tests:
            org.apache.hadoop.hbase.trace.TestHTraceHooks

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12628775/10525.v2.patch against trunk revision . ATTACHMENT ID: 12628775 +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 -1 site . The patch appears to cause mvn site goal to fail. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.trace.TestHTraceHooks Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8684//console This message is automatically generated.

            v2
            comments above taken into account
            htace hooks added (is that the right way to do that, eclark ?)
            optimization / race condition fixed in cleanupCalls
            simplification in waitForWork

            I will push a v3 with the option set to false to get an hadoop qa run with the two settings

            nkeywal Nicolas Liochon added a comment - v2 comments above taken into account htace hooks added (is that the right way to do that, eclark ?) optimization / race condition fixed in cleanupCalls simplification in waitForWork I will push a v3 with the option set to false to get an hadoop qa run with the two settings
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12629577/10525.v2.patch
            against trunk revision .
            ATTACHMENT ID: 12629577

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no new tests are needed for this patch.
            Also please list what manual steps were performed to verify this patch.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            +1 site. The mvn site goal succeeds with this patch.

            -1 core tests. The patch failed these unit tests:
            org.apache.hadoop.hbase.util.TestHBaseFsck

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629577/10525.v2.patch against trunk revision . ATTACHMENT ID: 12629577 +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 +1 site . The mvn site goal succeeds with this patch. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.util.TestHBaseFsck Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8735//console This message is automatically generated.

            same as v2 excepted that the new behavior is disabled by default.
            It's what I would like to commit.

            nkeywal Nicolas Liochon added a comment - same as v2 excepted that the new behavior is disabled by default. It's what I would like to commit.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12629790/10525.v3.patch
            against trunk revision .
            ATTACHMENT ID: 12629790

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no new tests are needed for this patch.
            Also please list what manual steps were performed to verify this patch.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            +1 site. The mvn site goal succeeds with this patch.

            -1 core tests. The patch failed these unit tests:
            org.apache.hadoop.hbase.util.TestHBaseFsck

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629790/10525.v3.patch against trunk revision . ATTACHMENT ID: 12629790 +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 +1 site . The mvn site goal succeeds with this patch. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.util.TestHBaseFsck Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8747//console This message is automatically generated.
            enis Enis Soztutar added a comment -

            I was testing this, and it seems that there is an issue when the connection is closed (RS killed). it maybe that the CallSender can still accept more calls even after Connection is closed and shouldCloseConnection is set. In case CallSender.cleanUp() runs first, and clears the queue, any call added to the CallSeender queue will not get notification, thus will hang. I think we may need to sync on CallSender accepting Calls with with the Connection.out. Other ways might also be possible.
            Here is a dump where the RPC callee threads are just hanging forever:

            "htable-pool15-t2" daemon prio=10 tid=0x0000000001f2e800 nid=0x1bc3 in Object.wait() [0x00007fca48f72000]
               java.lang.Thread.State: TIMED_WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435)
            	- locked <0x00000000bd6a0230> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
            	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1655)
            	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1713)
            	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:29300)
            	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:125)
            	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:53)
            	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:186)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:658)
            	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
            	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
            	at java.lang.Thread.run(Thread.java:722)
            
            "htable-pool23-t2" daemon prio=10 tid=0x00007fca4c5db800 nid=0x1bbb in Object.wait() [0x00007fca49779000]
               java.lang.Thread.State: TIMED_WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435)
            	- locked <0x00000000bd6a0578> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
            	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1655)
            	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1713)
            	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:29300)
            	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:125)
            	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:53)
            	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:186)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:658)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:850)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:824)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.logAndResubmit(AsyncProcess.java:998)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveGlobalFailure(AsyncProcess.java:952)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1000(AsyncProcess.java:546)
            	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:662)
            	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
            	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
            	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
            	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
            	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
            	at java.lang.Thread.run(Thread.java:722)
            
            "HBaseWriterThread_13" prio=10 tid=0x00007fca5cf68000 nid=0x1b95 in Object.wait() [0x00007fca4bda0000]
               java.lang.Thread.State: TIMED_WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1383)
            	- locked <0x00000000cc21a7e0> (a java.util.concurrent.atomic.AtomicLong)
            	at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1412)
            	at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1027)
            	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1298)
            	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:959)
            	at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:143)
            	at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:108)
            
            enis Enis Soztutar added a comment - I was testing this, and it seems that there is an issue when the connection is closed (RS killed). it maybe that the CallSender can still accept more calls even after Connection is closed and shouldCloseConnection is set. In case CallSender.cleanUp() runs first, and clears the queue, any call added to the CallSeender queue will not get notification, thus will hang. I think we may need to sync on CallSender accepting Calls with with the Connection.out. Other ways might also be possible. Here is a dump where the RPC callee threads are just hanging forever: "htable-pool15-t2" daemon prio=10 tid=0x0000000001f2e800 nid=0x1bc3 in Object .wait() [0x00007fca48f72000] java.lang. Thread .State: TIMED_WAITING (on object monitor) at java.lang. Object .wait(Native Method) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435) - locked <0x00000000bd6a0230> (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1655) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1713) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:29300) at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:125) at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:53) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:186) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:658) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang. Thread .run( Thread .java:722) "htable-pool23-t2" daemon prio=10 tid=0x00007fca4c5db800 nid=0x1bbb in Object .wait() [0x00007fca49779000] java.lang. Thread .State: TIMED_WAITING (on object monitor) at java.lang. Object .wait(Native Method) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435) - locked <0x00000000bd6a0578> (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1655) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1713) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:29300) at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:125) at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:53) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:186) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:658) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:850) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:824) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.logAndResubmit(AsyncProcess.java:998) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveGlobalFailure(AsyncProcess.java:952) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1000(AsyncProcess.java:546) at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:662) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang. Thread .run( Thread .java:722) "HBaseWriterThread_13" prio=10 tid=0x00007fca5cf68000 nid=0x1b95 in Object .wait() [0x00007fca4bda0000] java.lang. Thread .State: TIMED_WAITING (on object monitor) at java.lang. Object .wait(Native Method) at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1383) - locked <0x00000000cc21a7e0> (a java.util.concurrent.atomic.AtomicLong) at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1412) at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1027) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1298) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:959) at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:143) at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:108)

            Thanks a lot for the detailed analysis, Enis. v4 fixes it and adds a test.

            This scenario was ok already:
            (connection marked as close)
            reader close - cleanup expected reads
            writer close - cleanup calls to writes
            There is no possible race condition between the two, because we check the connection status in writeRequest. If you did the cleanup, it means that connectionClosed was true, so the writeRequest will fail.

            another scenario was NOT ok:
            new client get connection - connection is opened
            (connection marked as closed)
            reader close - cleanup expected reads
            writer close - cleanup calls to writes
            client thread add call to calls to write

            The solution is to add a check after we add the call, exactly as in the first scenario. This does not require another synchronization point.

            nkeywal Nicolas Liochon added a comment - Thanks a lot for the detailed analysis, Enis. v4 fixes it and adds a test. This scenario was ok already: (connection marked as close) reader close - cleanup expected reads writer close - cleanup calls to writes There is no possible race condition between the two, because we check the connection status in writeRequest. If you did the cleanup, it means that connectionClosed was true, so the writeRequest will fail. another scenario was NOT ok: new client get connection - connection is opened (connection marked as closed) reader close - cleanup expected reads writer close - cleanup calls to writes client thread add call to calls to write The solution is to add a check after we add the call, exactly as in the first scenario. This does not require another synchronization point.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12630060/10525.v4.patch
            against trunk revision .
            ATTACHMENT ID: 12630060

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 3 new or modified tests.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            +1 site. The mvn site goal succeeds with this patch.

            -1 core tests. The patch failed these unit tests:
            org.apache.hadoop.hbase.client.TestHCM
            org.apache.hadoop.hbase.util.TestHBaseFsck

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630060/10525.v4.patch against trunk revision . ATTACHMENT ID: 12630060 +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 3 new or modified tests. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 +1 site . The mvn site goal succeeds with this patch. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.client.TestHCM org.apache.hadoop.hbase.util.TestHBaseFsck Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//console This message is automatically generated.

            conflict on the name for testHCM. Will fix.
            For Fsck, seems flaky these days. Message is: java.lang.AssertionError: expected: EXPIRED_TABLE_LOCK but was:UNKNOWN, NO_META_REGION, RS_CONNECT_FAILURE, RS_CONNECT_FAILURE, RS_CONNECT_FAILURE, EXPIRED_TABLE_LOCK at org.apache.hadoop.hbase.util.TestHBaseFsck.testCheckTableLocks (TestHBaseFsck.java:2072)

            nkeywal Nicolas Liochon added a comment - conflict on the name for testHCM. Will fix. For Fsck, seems flaky these days. Message is: java.lang.AssertionError: expected: EXPIRED_TABLE_LOCK but was:UNKNOWN, NO_META_REGION, RS_CONNECT_FAILURE, RS_CONNECT_FAILURE, RS_CONNECT_FAILURE, EXPIRED_TABLE_LOCK at org.apache.hadoop.hbase.util.TestHBaseFsck.testCheckTableLocks (TestHBaseFsck.java:2072)

            And the issue with TestHBaseFsck comes from the EnvironmentEdgeManager; the test changes the time, it impacts the RpcClient....

            nkeywal Nicolas Liochon added a comment - And the issue with TestHBaseFsck comes from the EnvironmentEdgeManager; the test changes the time, it impacts the RpcClient....
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12630090/10525.v5.patch
            against trunk revision .
            ATTACHMENT ID: 12630090

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 3 new or modified tests.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            +1 site. The mvn site goal succeeds with this patch.

            -1 core tests. The patch failed these unit tests:
            org.apache.hadoop.hbase.util.TestHBaseFsck

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630090/10525.v5.patch against trunk revision . ATTACHMENT ID: 12630090 +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 3 new or modified tests. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 +1 site . The mvn site goal succeeds with this patch. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.util.TestHBaseFsck Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//console This message is automatically generated.
            enis Enis Soztutar added a comment -

            Thanks Nicolas.

            The solution is to add a check after we add the call, exactly as in the first scenario. This does not require another synchronization point.

            Agreed. The patch seems to be fixing this.

            Is this sleep(1) intended, or left from debugging?

            -    connection.writeRequest(call, priority);
            +    Thread.sleep(1);
            
            enis Enis Soztutar added a comment - Thanks Nicolas. The solution is to add a check after we add the call, exactly as in the first scenario. This does not require another synchronization point. Agreed. The patch seems to be fixing this. Is this sleep(1) intended, or left from debugging? - connection.writeRequest(call, priority); + Thread .sleep(1);
            enis Enis Soztutar added a comment -

            This change might have some consequences, because the connection threads will be kept for much longer. Can we instead do this change only on the test?

            -        conf.getInt("hbase.ipc.client.connection.minIdleTimeBeforeClose", 120000); // 2 minutes
            +        conf.getInt("hbase.ipc.client.connection.minIdleTimeBeforeClose", 1200000); // 20 minutes
            
            enis Enis Soztutar added a comment - This change might have some consequences, because the connection threads will be kept for much longer. Can we instead do this change only on the test? - conf.getInt( "hbase.ipc.client.connection.minIdleTimeBeforeClose" , 120000); // 2 minutes + conf.getInt( "hbase.ipc.client.connection.minIdleTimeBeforeClose" , 1200000); // 20 minutes

            Is this sleep(1) intended, or left from debugging?

            Oops. left from debugging.

            minIdleTimeBeforeClose", 120000

            Yes, I've changed it to see if it was the issue with TestHBaseFsck: if the connection becomes idle, the TestHBaseFsck fails. I will write a test on the RpcClient alone: there is no test for this feature. I can update it for the test itself, but I also need to check what it means for Fsck itself: may be it relies on having the same tcp connection. Lastly, the control of the idle connection is new (previously we were mixing bugs and client pings to keep the connection alive), I don't know what is the best value (but 2 minutes seems much better than 20 for sure)

            nkeywal Nicolas Liochon added a comment - Is this sleep(1) intended, or left from debugging? Oops. left from debugging. minIdleTimeBeforeClose", 120000 Yes, I've changed it to see if it was the issue with TestHBaseFsck: if the connection becomes idle, the TestHBaseFsck fails. I will write a test on the RpcClient alone: there is no test for this feature. I can update it for the test itself, but I also need to check what it means for Fsck itself: may be it relies on having the same tcp connection. Lastly, the control of the idle connection is new (previously we were mixing bugs and client pings to keep the connection alive), I don't know what is the best value (but 2 minutes seems much better than 20 for sure)
            enis Enis Soztutar added a comment -

            v5 lgtm, except for the two issues above.

            enis Enis Soztutar added a comment - v5 lgtm, except for the two issues above.
            enis Enis Soztutar added a comment -

            I've tested v5 (minus the sleep(1)) with

            hbase org.apache.hadoop.hbase.IntegrationTestIngest -Dhbase.ipc.client.allowsInterrupt=true --monkey unbalance
            

            which passes.

            enis Enis Soztutar added a comment - I've tested v5 (minus the sleep(1)) with hbase org.apache.hadoop.hbase.IntegrationTestIngest -Dhbase.ipc.client.allowsInterrupt= true --monkey unbalance which passes.

            Ok the bad scenario for TestFsck was:
            starts
            idle time reached but connection not yet reclaimed
            fsck does a new call
            at this moment the connection is closed as idle
            this should lead to a retry, and we would be done, but it's not enough for TestFsck

            I solved this by managing this specific case explicitly: if the connection was idle but not yet closed and we receive a call, we don't close the connection

            Test added as well.

            nkeywal Nicolas Liochon added a comment - Ok the bad scenario for TestFsck was: starts idle time reached but connection not yet reclaimed fsck does a new call at this moment the connection is closed as idle this should lead to a retry, and we would be done, but it's not enough for TestFsck I solved this by managing this specific case explicitly: if the connection was idle but not yet closed and we receive a call, we don't close the connection Test added as well.
            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12630287/10525.v6.patch
            against trunk revision .
            ATTACHMENT ID: 12630287

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 3 new or modified tests.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            +1 site. The mvn site goal succeeds with this patch.

            +1 core tests. The patch passed unit tests in .

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630287/10525.v6.patch against trunk revision . ATTACHMENT ID: 12630287 +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 3 new or modified tests. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. -1 findbugs . The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 +1 site . The mvn site goal succeeds with this patch. +1 core tests . The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8767//console This message is automatically generated.

            v7 removes the findbugs warning by adding it to the exclusion list. It's useless in this class.

            nkeywal Nicolas Liochon added a comment - v7 removes the findbugs warning by adding it to the exclusion list. It's useless in this class.
            hadoopqa Hadoop QA added a comment -

            +1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12630316/10525.v7.patch
            against trunk revision .
            ATTACHMENT ID: 12630316

            +1 @author. The patch does not contain any @author tags.

            +1 tests included. The patch appears to include 3 new or modified tests.

            +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

            +1 hadoop1.1. The patch compiles against the hadoop 1.1 profile.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 lineLengths. The patch does not introduce lines longer than 100

            +1 site. The mvn site goal succeeds with this patch.

            +1 core tests. The patch passed unit tests in .

            Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//testReport/
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
            Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
            Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//console

            This message is automatically generated.

            hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630316/10525.v7.patch against trunk revision . ATTACHMENT ID: 12630316 +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 3 new or modified tests. +1 hadoop1.0 . The patch compiles against the hadoop 1.0 profile. +1 hadoop1.1 . The patch compiles against the hadoop 1.1 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 lineLengths . The patch does not introduce lines longer than 100 +1 site . The mvn site goal succeeds with this patch. +1 core tests . The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8768//console This message is automatically generated.
            nkeywal Nicolas Liochon added a comment - - edited

            A green build. How cute.
            Any +1 around ((I will commit with the option NOT activated)?

            nkeywal Nicolas Liochon added a comment - - edited A green build. How cute. Any +1 around ((I will commit with the option NOT activated)?
            enis Enis Soztutar added a comment -

            Let me run the test one more time. I'll +1 after.

            enis Enis Soztutar added a comment - Let me run the test one more time. I'll +1 after.
            enis Enis Soztutar added a comment -

            Ok, the test (mentioned above) succeeds. +1.

            enis Enis Soztutar added a comment - Ok, the test (mentioned above) succeeds. +1.
            ddas Devaraj Das added a comment -

            One quick question - when the connection.interrupt() is invoked the reader thread gets it.. What happens to the writer thread if it was waiting in the callsToWrite.take() call.

            ddas Devaraj Das added a comment - One quick question - when the connection.interrupt() is invoked the reader thread gets it.. What happens to the writer thread if it was waiting in the callsToWrite.take() call.

            when the connection.interrupt() is invoked the reader thread gets it.. What happens to the writer thread if it was waiting in the callsToWrite.take() call.

            In #markClosed, we put a Call to the queue (the DEATH_PILL), this way the writers exits the 'take' method. The reader thread calls #markClosed on any exception, interruptions included.

            nkeywal Nicolas Liochon added a comment - when the connection.interrupt() is invoked the reader thread gets it.. What happens to the writer thread if it was waiting in the callsToWrite.take() call. In #markClosed, we put a Call to the queue (the DEATH_PILL), this way the writers exits the 'take' method. The reader thread calls #markClosed on any exception, interruptions included.

            Committed to trunk, thanks for the reviews, all. (Devaraj, I understood your comment as 'ok if', I can obviously revert / amend if you want more time for the review.). Same goes for everyone who wants to chime in: this code is obviously critical & complex.

            nkeywal Nicolas Liochon added a comment - Committed to trunk, thanks for the reviews, all. (Devaraj, I understood your comment as 'ok if', I can obviously revert / amend if you want more time for the review.). Same goes for everyone who wants to chime in: this code is obviously critical & complex.
            hudson Hudson added a comment -

            FAILURE: Integrated in HBase-TRUNK #4947 (See https://builds.apache.org/job/HBase-TRUNK/4947/)
            HBASE-10525 Allow the client to use a different thread for writing to ease interrupt (nkeywal: rev 1571210)

            • /hbase/trunk/dev-support/findbugs-exclude.xml
            • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java
            • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java
            • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
            • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
            hudson Hudson added a comment - FAILURE: Integrated in HBase-TRUNK #4947 (See https://builds.apache.org/job/HBase-TRUNK/4947/ ) HBASE-10525 Allow the client to use a different thread for writing to ease interrupt (nkeywal: rev 1571210) /hbase/trunk/dev-support/findbugs-exclude.xml /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
            ddas Devaraj Das added a comment -

            Yes, nkeywal, i just had that question. Thanks for the clarification.

            ddas Devaraj Das added a comment - Yes, nkeywal , i just had that question. Thanks for the clarification.
            stack Michael Stack added a comment -

            nkeywal

            What is idea of this:

            -Dhbase.ipc.client.allowsInterrupt=true

            Why would we not have interrupt on by default given you have done all this work to make it work?

            Did a quick scan... looks good.

            stack Michael Stack added a comment - nkeywal What is idea of this: -Dhbase.ipc.client.allowsInterrupt=true Why would we not have interrupt on by default given you have done all this work to make it work? Did a quick scan... looks good.
            hudson Hudson added a comment -

            FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #98 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/98/)
            HBASE-10525 Allow the client to use a different thread for writing to ease interrupt (nkeywal: rev 1571210)

            • /hbase/trunk/dev-support/findbugs-exclude.xml
            • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java
            • /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java
            • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
            • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
            hudson Hudson added a comment - FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #98 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/98/ ) HBASE-10525 Allow the client to use a different thread for writing to ease interrupt (nkeywal: rev 1571210) /hbase/trunk/dev-support/findbugs-exclude.xml /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java

            Why would we not have interrupt on by default given you have done all this work to make it work?

            The name may not be perfect: when the option is not activated, an interrupt may cut the tcp connection. activated, the write will be in a different thread, so it won't be cut. However, it adds a Thread per region server, so I didn't activated it by default. In interruption will still be possible, but if they cut the tcp connection, the calls in progress will have to be retried.

            Now that the rpc timeout issue is fixed, I hope to be able to have a single thread pool, so then the option will become useless, but there are a few days of work ahead before removing this... The issue, as usual, is to have something that works well under failures...

            nkeywal Nicolas Liochon added a comment - Why would we not have interrupt on by default given you have done all this work to make it work? The name may not be perfect: when the option is not activated, an interrupt may cut the tcp connection. activated, the write will be in a different thread, so it won't be cut. However, it adds a Thread per region server, so I didn't activated it by default. In interruption will still be possible, but if they cut the tcp connection, the calls in progress will have to be retried. Now that the rpc timeout issue is fixed, I hope to be able to have a single thread pool, so then the option will become useless, but there are a few days of work ahead before removing this... The issue, as usual, is to have something that works well under failures...
            enis Enis Soztutar added a comment -

            I've also committed this to hbase-10070.

            enis Enis Soztutar added a comment - I've also committed this to hbase-10070.
            enis Enis Soztutar added a comment -

            Closing this issue after 0.99.0 release.

            enis Enis Soztutar added a comment - Closing this issue after 0.99.0 release.

            People

              nkeywal Nicolas Liochon
              nkeywal Nicolas Liochon
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: