HBASE-11355: a couple of callQueue related improvements

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.99.0, 0.94.20
    • Fix Version/s: 0.99.0, 0.98.4
    • Component/s: IPC/RPC, Performance
    • Labels:
      None
    • Release Note:
      Nearly doubles random read throughput (on a random-read test bench with 8 clients running YCSB workloadc against a single server, throughput goes from 180k to 335k ops per second).

      Description

      In one of my in-memory, read-only tests (100% get requests), one of the top scalability bottlenecks was the single callQueue. A tentative change that sharded this callQueue according to the RPC handler count showed a big throughput improvement (the original get() qps was around 60k; after this change and other hotspot tuning, I got 220k get() qps on the same single region server) in a YCSB read-only scenario.
      Another thing we can do is separate the queue into a read call queue and a write call queue. We have done this in our internal branch; it helps in some outages by preventing all-read or all-write requests from exhausting all handler threads.
      One more thing is changing the current blocking behavior once the callQueue is full. A full callQueue almost always means the backend processing is slow, so failing fast would be more reasonable if we use HBase as a low-latency processing system. See "callQueue.put(call)".
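
      To make the fail-fast point concrete, here is a minimal sketch (plain java.util.concurrent, not actual HBase code; the Call stand-in and method names are hypothetical) contrasting the current blocking put() with a fail-fast offer() on the call queue:

        import java.util.concurrent.BlockingQueue;
        import java.util.concurrent.LinkedBlockingQueue;

        public class FailFastDispatch {
          // Hypothetical stand-in for an RPC call; not the real HBase Call class.
          static class Call {}

          private final BlockingQueue<Call> callQueue = new LinkedBlockingQueue<Call>(100);

          // Current behavior: put() blocks the caller until space frees up,
          // so a slow backend stalls request intake.
          void dispatchBlocking(Call call) throws InterruptedException {
            callQueue.put(call);
          }

          // Fail-fast alternative: offer() returns false immediately when the
          // queue is full, so the server can reject the call right away instead
          // of accumulating latency.
          boolean dispatchFailFast(Call call) {
            return callQueue.offer(call);
          }
        }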

      1. HBASE-11355-v2.patch
        32 kB
        Matteo Bertozzi
      2. HBASE-11355-v1.patch
        32 kB
        Matteo Bertozzi
      3. gets.png
        12 kB
        stack
      4. HBASE-11355-v0.patch
        30 kB
        Matteo Bertozzi


          Activity

          stack added a comment -

          Just to say that investigating, a long time later, this patch alone doubled our throughput for random read workloads (workloadc in ycsb).

          Enis Soztutar added a comment -

          Closing this issue after 0.99.0 release.

          Hudson added a comment -

          FAILURE: Integrated in HBase-1.0 #115 (See https://builds.apache.org/job/HBase-1.0/115/)
          HBASE-11737 Document callQueue improvements from HBASE-11355 and HBASE-11724 (Misty Stanley-Jones) (matteo.bertozzi: rev 5c1ae840f21f7a3857543e408ef20a63be2b0751)

          • src/main/docbkx/performance.xml
          Hudson added a comment -

          FAILURE: Integrated in HBase-TRUNK #5414 (See https://builds.apache.org/job/HBase-TRUNK/5414/)
          HBASE-11737 Document callQueue improvements from HBASE-11355 and HBASE-11724 (Misty Stanley-Jones) (matteo.bertozzi: rev a55a65017cc182e3efd4639e3959af09f178d7d1)

          • src/main/docbkx/performance.xml
          Andrew Purtell added a comment -

          See https://issues.apache.org/jira/browse/PHOENIX-938?focusedCommentId=14056785&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14056785 and successor comments.
          Hudson added a comment -

          FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #348 (See https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/348/)
          HBASE-11355 a couple of callQueue related improvements (matteo.bertozzi: rev f8d50681f628bac50a0082c4c7db52cc0f96ae6a)

          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MultipleQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RWQueueRpcExecutor.java
          • hbase-common/src/main/resources/hbase-default.xml
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SingleQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java
          • hbase-common/src/main/java/org/apache/hadoop/hbase/util/ReflectionUtils.java
          Hudson added a comment -

          SUCCESS: Integrated in HBase-0.98 #368 (See https://builds.apache.org/job/HBase-0.98/368/)
          HBASE-11355 a couple of callQueue related improvements (matteo.bertozzi: rev f8d50681f628bac50a0082c4c7db52cc0f96ae6a)

          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SingleQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RWQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java
          • hbase-common/src/main/java/org/apache/hadoop/hbase/util/ReflectionUtils.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MultipleQueueRpcExecutor.java
          • hbase-common/src/main/resources/hbase-default.xml
          Hudson added a comment -

          SUCCESS: Integrated in HBase-1.0 #2 (See https://builds.apache.org/job/HBase-1.0/2/)
          HBASE-11355 a couple of callQueue related improvements (matteo.bertozzi: rev 9a6a59c7b7d8357c50fd32f01d0ca21911db3da2)

          • hbase-common/src/main/java/org/apache/hadoop/hbase/util/ReflectionUtils.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SingleQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RWQueueRpcExecutor.java
          • hbase-common/src/main/resources/hbase-default.xml
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MultipleQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java
          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #5254 (See https://builds.apache.org/job/HBase-TRUNK/5254/)
          HBASE-11355 a couple of callQueue related improvements (matteo.bertozzi: rev 0e8e41d0ef5985aa3d424621eb8fcb1ca68838ab)

          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/MultipleQueueRpcExecutor.java
          • hbase-common/src/main/java/org/apache/hadoop/hbase/util/ReflectionUtils.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SingleQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RWQueueRpcExecutor.java
          • hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcScheduler.java
          • hbase-common/src/main/resources/hbase-default.xml
          Andrew Purtell added a comment -

          Andrew Purtell You probably want this too sir?

          Yes please! +1

          stack added a comment -

          +1 Looks very nice. Andrew Purtell You probably want this too sir?

          stack added a comment -

          Matteo Bertozzi That'd be excellent. Suggest enabling multiqueues by default – factor of .1? – since such a nice benefit.

          Amen Anoop Sam John

          Anoop Sam John added a comment -

          Since we need a restart after every config change, it would be excellent if we could auto-adjust the number of read/write queues based on the observed read/write pattern.
          But that is obviously another story and a future issue. Just noting it for reference.

          Matteo Bertozzi added a comment -

          stack something like this?

          • ipc.server.callqueue.handler.factor
            • Factor to determine the number of call queues. (callq.factor * handlers.count)
            • A value of 0 means a single queue shared between all the handlers.
            • A value of 1 means that each handler has its own queue.
            • A value of 0.5 means that every 2 handlers share the same queue.
            • A value > 1 will add more handlers, num queues = num handlers
          • ipc.server.callqueue.read.share
            • Split the call queues into read and write queues.
            • A value of 0 indicates not to split the call queues.
            • A value of 0.5 means there will be the same number of read and write queues
            • A value of 1.0 (or more) means that all the queues except one are used to dispatch read requests.

          The defaults are 0, which means a single queue as before. Or do you want to bump the "ipc.server.callqueue.handler.factor" default to use more than one queue?
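
          For illustration, the proposal above would look something like this in hbase-site.xml (values are examples only, not suggested defaults):

            <!-- e.g. 30 handlers * 0.1 = 3 call queues -->
            <property>
              <name>ipc.server.callqueue.handler.factor</name>
              <value>0.1</value>
            </property>
            <!-- half of the call queues serve reads, half serve writes -->
            <property>
              <name>ipc.server.callqueue.read.share</name>
              <value>0.5</value>
            </property>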

          stack added a comment -

          Could we have a version that is basic and that makes factor * handler-count queues, where factor is a number we decide on, say 1/5th or 1/10th? The idea is that this facility is on by default and the user does not have to do any configuration to get the basics working. Should they later want to play w/ queue counts, they can mess with the configs, setting an explicit queue number and/or read vs write proportions.

          I took a look at the patch, lgtm. One day we should sweep through the code base and just make a Random per handler or so rather than create an instance per usage. Later.

          On:

          + // TODO: Is there a better way to do this?

          ... yeah... but it ain't too bad. You are narrowing when you do the interrogation and then you are looking for the Mutation marker Interface so not too fragile.

          On your model for plugging in different RpcExecutors, it's good.

          Make info level:

          + LOG.debug("Using " + callQueueType + " as user call queue, count=" + numCallQueues);

          This is pretty important info...

          This is good:

          -  return callQueue.size();
          +  return callExecutor.getQueueLength();

          It's great Matteo Bertozzi

          Andrew Purtell added a comment -

          I see more than 50% more throughput on pure random reads from cache

          Nice.

          Shall we port to 0.98 also? See above discussion.

          stack added a comment -

          Random reads per second. First hump is w/o patch. Second hump is w/ patch and config applied.

          Here is what I ran:

          for i in c2020 c2022 c2023 c2024 c2025; do echo $i; ssh $i "nohup ./hbase-0.99.0-SNAPSHOT/bin/hbase --config ~/conf_hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --size=2.0 randomRead 30 > pe.out 2> pe.err < /dev/null &"; done

          stack added a comment -

          Liang Xie Pilot error.

          I see more than 50% more throughput on pure random reads from cache if I apply the patch and set the below config:

          <property>
          <name>ipc.server.num.callqueue</name>
          <value>10</value>
          </property>

          My handler count is the default for master, i.e. 30.

          Can we enable this by default? Add a fat release note, and in hbase-default.xml tie this new config and the handler count together, at least in the description?

          Liang Xie added a comment -

          weird...
          1. A full in-memory random read test, right?
          2. How many ycsb processes and concurrent threads?
          3. Please ensure it's not a network-saturated scenario (e.g. tune ycsb settings like readallfields, or use a small fieldcount/fieldlength in the loading phase).
          4. Simply taking tens of thread dumps during testing could give more clues about the current hotspot.

          stack added a comment -

          Doing basic random read test I don't see benefit just yet. Let me try and load up more concurrency and see if that brings it out....

          stack added a comment -

          Looks great. Will give closer review soon. Testing...

          Ted Yu added a comment -

          For RWQueueRpcExecutor:

          +    this(name, Math.max(1, (int)Math.round(handlerCount * writeShare)),
          +      Math.max(1, (int)Math.round(handlerCount * readShare)),
          

          Is writeShare+readShare supposed to be 1.0?
          If so, can you add some validation logic? Specifying one of the two would be enough, right?
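
          A hypothetical sketch of that validation, deriving the write share from the read share so the two always sum to 1.0 (illustrative names, not the patch's actual code):

            // Illustrative only: split handlerCount between readers and writers
            // from a single readShare parameter, rejecting out-of-range values.
            static int[] splitHandlers(int handlerCount, float readShare) {
              if (readShare < 0f || readShare > 1f) {
                throw new IllegalArgumentException("readShare must be in [0, 1]: " + readShare);
              }
              int readHandlers = Math.max(1, Math.round(handlerCount * readShare));
              int writeHandlers = Math.max(1, handlerCount - readHandlers);
              return new int[] { readHandlers, writeHandlers };
            }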

          Andrew Purtell added a comment -

          My code has changed the scheduler by moving the queues and thread creation into an RpcExecutor class with SingleQueueRpcExecutor, MultipleQueueRpcExecutor, and RWQueueRpcExecutor variants. It is simpler for me since I'm experimenting with lots of different queues for HBASE-10994, and it allows me to simply create a different instance of the RpcExecutor without continually changing the SimpleRpcScheduler, but I'm open to suggestions if someone wants this stuff in 0.98 or 1.0.

          Changing RPC internals for the sake of improving performance like this in 0.98 isn't a problem as long as public interfaces (including coprocessor related) are not touched. If that happens, then we'd want to do a case by case evaluation.

          I looked at the attached patch and find no reason why it, and its prerequisite changes, could not go into 0.98. It would be good to have an opportunity to try it. If we find a significant regression during release candidate validation we can revert.

          Matteo Bertozzi added a comment -

          Attached a patch that allows you to have multiple queues and the division between reads and writes.

          The default behavior is unchanged: a single queue with N handlers.

          By setting "ipc.server.num.callqueue" > 1 you get M queues and N handlers (where N is still "hbase.regionserver.handler.count").

          If "ipc.server.num.callqueue" is > 1 and you set "ipc.server.callqueue.read.share" or "ipc.server.callqueue.write.share", you get the division between reads and writes. The number of handlers will be (handler.count * share) and the number of queues (num.callqueue * share).
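
          For example (illustrative values only), something like the following in hbase-site.xml would give 10 queues split between reads and writes:

            <property>
              <name>ipc.server.num.callqueue</name>
              <value>10</value>
            </property>
            <property>
              <name>ipc.server.callqueue.read.share</name>
              <value>0.5</value>
            </property>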

          My code has changed the scheduler by moving the queues and thread creation into an RpcExecutor class with SingleQueueRpcExecutor, MultipleQueueRpcExecutor, and RWQueueRpcExecutor variants. It is simpler for me since I'm experimenting with lots of different queues for HBASE-10994, and it allows me to simply create a different instance of the RpcExecutor without continually changing the SimpleRpcScheduler, but I'm open to suggestions if someone wants this stuff in 0.98 or 1.0.

          stack added a comment -

          Thanks Liang Xie

          Liang Xie added a comment -

          I don't have a clean 0.94 patch; it's a preliminary hack. Other hotspots include responseQueuesSizeThrottler, rpcMetrics, scannerReadPoints, etc.
          The minor change to the callQueue looks like the below (we had separated the original callQueue into readCallQueue and writeCallQueue):

          -  protected BlockingQueue<Call> readCallQueue; // read queued calls
          +  protected List<BlockingQueue<Call>> readCallQueues; // read queued calls
          ...
          -          boolean success = readCallQueue.offer(call);
          +          boolean success = readCallQueues.get(rand.nextInt(readHandlerCount)).offer(call);
          ...
          -    this.readCallQueue = new LinkedBlockingQueue<Call>(readQueueLength);
          +    this.readHandlerCount = Math.round(readQueueRatio * handlerCount);
          +    this.readCallQueues = new LinkedList<BlockingQueue<Call>>();
          +    for (int i = 0; i < readHandlerCount; i++) {
          +      readCallQueues.add(new LinkedBlockingQueue<Call>(readQueueLength));
          +    }

          Every handler thread consumes its own queue, eliminating the severe contention.
          If correctness or extra resource consumption is a concern, another call-queue sharding option is to introduce a queue-number setting (I just used the handler count, for simplicity, to get a raw perf number) and always route requests from the same client to the same queue, as sketched below.
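
          A hypothetical sketch of that client-affinity sharding (illustrative names, not the actual patch): the shard is picked by hashing the client id, so one client's calls always land on the same queue.

            import java.util.ArrayList;
            import java.util.List;
            import java.util.concurrent.BlockingQueue;
            import java.util.concurrent.LinkedBlockingQueue;

            public class ClientAffinityQueues {
              static class Call {}  // stand-in for the RPC Call class

              private final List<BlockingQueue<Call>> queues = new ArrayList<BlockingQueue<Call>>();

              public ClientAffinityQueues(int numQueues, int queueLength) {
                for (int i = 0; i < numQueues; i++) {
                  queues.add(new LinkedBlockingQueue<Call>(queueLength));
                }
              }

              // All calls from the same client hash to the same queue, preserving
              // per-client ordering while still sharding the contention.
              public boolean dispatch(String clientId, Call call) {
                int index = (clientId.hashCode() & Integer.MAX_VALUE) % queues.size();
                return queues.get(index).offer(call);
              }
            }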

          stack added a comment -

          A tentative change that sharded this callQueue according to the RPC handler count...

          Liang Xie What does the above mean? Any chance of your pasting a patch, even if it is just your 0.94 rpc class... we can figure the rest (I have a little rig up here so can test np). Thanks boss.

          Liang Xie added a comment -

          Thank you, Matteo, good on you

          Matteo Bertozzi added a comment -

          I'll take this one, since I'm already playing with multiple call queues as part of HBASE-10994

          Liang Xie added a comment -

          Unassigned from me. It would be great if someone else picked it up; I am busy with other stuff.

          chunhui shen added a comment -

          220k qps seems like great performance. Waiting for the patch.

          Agreeing with the other points mentioned: we also separated the read and write requests and made the response fail fast once the queue is full.

          Should we make these things into separate issues?


            People

            • Assignee: Matteo Bertozzi
            • Reporter: Liang Xie
            • Votes: 0
            • Watchers: 20
