Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.99.0, 2.0.0
    • Component/s: documentation
    • Labels: None
      Attachments

    1. HBASE-11737.patch
      3 kB
      Misty Stanley-Jones
    2. HBASE-11737.patch
      8 kB
      Misty Stanley-Jones
    3. HBASE_11737.patch
      8 kB
      Misty Stanley-Jones

      Activity

      Enis Soztutar added a comment -

      Closing this issue after 0.99.0 release.

      Hudson added a comment -

      FAILURE: Integrated in HBase-1.0 #115 (See https://builds.apache.org/job/HBase-1.0/115/)
      HBASE-11737 Document callQueue improvements from HBASE-11355 and HBASE-11724 (Misty Stanley-Jones) (matteo.bertozzi: rev 5c1ae840f21f7a3857543e408ef20a63be2b0751)

      • src/main/docbkx/performance.xml
      Hudson added a comment -

      FAILURE: Integrated in HBase-TRUNK #5414 (See https://builds.apache.org/job/HBase-TRUNK/5414/)
      HBASE-11737 Document callQueue improvements from HBASE-11355 and HBASE-11724 (Misty Stanley-Jones) (matteo.bertozzi: rev a55a65017cc182e3efd4639e3959af09f178d7d1)

      • src/main/docbkx/performance.xml
      Matteo Bertozzi added a comment -

      +1

      Misty Stanley-Jones added a comment -

      OK, thanks for the clarification. I think I had a lightbulb moment and added a little more detail to explain, and I also made your corrections. Sorry about getting mixed up. By the way, the bad math was left over from my first attempt, which used 25/50/75/100 but didn't work nicely with 10 queues.

      Matteo Bertozzi added a comment -

      hbase.ipc.server.callqueue.read.ratio
      This factor weights the queues toward reads (if below .5) or writes (if above .5).

      It's the other way around; the examples are OK except one.

      A value of .6 uses 75% of the queues for writing and 25% for reading. Given a value of 10 for
      hbase.ipc.server.num.callqueue, 7 queues would be used for reads and 3 for writes.</para>

      Some weird math in here: 0.6 should give you 60%, not 75%. It is basically the reverse of the 0.3 example above, which is good, and it is also 60% for reading (only the first part is wrong).
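
      (For concreteness, and assuming the semantics spelled out later in this thread: with hbase.ipc.server.num.callqueue set to 10, a read.ratio of 0.6 would give 10 x 0.6 = 6 queues for reads and the remaining 4 for writes, mirroring the 0.3 example.)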

      You can also split the read queues so that separate queues are used for short reads
      (from Get operations) and short reads (from Scan operations)

      Short reads (Get) and long reads (Scan); you have two "short" in there.

      Misty Stanley-Jones added a comment -

      What do you think, Matteo Bertozzi?

      Hadoop QA added a comment -

      -1 overall. Here are the results of testing the latest attachment
      http://issues.apache.org/jira/secure/attachment/12661971/HBASE-11737.patch
      against trunk revision .
      ATTACHMENT ID: 12661971

      +1 @author. The patch does not contain any @author tags.

      +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

      +1 javac. The applied patch does not increase the total number of javac compiler warnings.

      +1 javac. The applied patch does not increase the total number of javac compiler warnings.

      +1 javadoc. The javadoc tool did not generate any warning messages.

      +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

      +1 release audit. The applied patch does not increase the total number of release audit warnings.

      +1 lineLengths. The patch does not introduce lines longer than 100

      +1 site. The mvn site goal succeeds with this patch.

      -1 core tests. The patch failed these unit tests:
      org.apache.hadoop.hbase.TestRegionRebalancing
      org.apache.hadoop.hbase.replication.TestPerTableCFReplication

      Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//testReport/
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
      Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10450//console

      This message is automatically generated.

      Misty Stanley-Jones added a comment -

      Thanks Matteo Bertozzi, let me know if this is better. Also, if you could check that I'm right about how hbase.ipc.server.callqueue.handler.factor works, that would be good. I'm not quite sure about it.

      Matteo Bertozzi added a comment -

      ipc.server.callqueue.handler.factor
      A value between <literal>0</literal> and <literal>1</literal> gives each handler
      a percentage of a queue. For instance, a value of <literal>.5</literal> shares one
      queue between each two handlers.</para>

      Is this correct? I mean the example is correct, but "gives each handler a percentage of a queue"
      to me sounds like the other way around, where 0 means share nothing and 1 means share everything.
      But maybe it is just me not reading it correctly.

      You can also add that the benefit of having multiple queues (e.g. one per handler) is that there is less contention when a task is added to or selected from a queue, which results in better performance.
      But it also means that if you have two queues and one of them ends up with a task that takes a long time, you end up with one handler waiting to receive its next call instead of executing the pending ones in the other queue.
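
      A minimal sketch of the handler.factor arithmetic as described above (not the actual HBase RpcExecutor code; the class and method names are made up for illustration):

      public class HandlerFactorSketch {
        // handler.factor 0 -> one queue shared by all handlers,
        // 0.5 -> one queue per two handlers, 1 -> one queue per handler.
        static int numCallQueues(int handlerCount, double handlerFactor) {
          return Math.max(1, (int) Math.round(handlerCount * handlerFactor));
        }

        public static void main(String[] args) {
          System.out.println(numCallQueues(30, 0.0)); // 1: all 30 handlers share one queue
          System.out.println(numCallQueues(30, 0.5)); // 15: one queue per two handlers
          System.out.println(numCallQueues(30, 1.0)); // 30: one queue per handler, least contention
        }
      }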

      read.share was renamed to read.ratio (no need to document the change, since no release shipped with .share).
      I've also added more examples, after a discussion with Jon, which you should include.

      The specified interval (which should be between 0.0 and 1.0) will be multiplied by the number of call queues.
      A value of 0 indicates not to split the call queues, meaning that both read and write requests will be pushed to the same set of queues.
      A value lower than 0.5 means that there will be fewer read queues than write queues.
      A value of 0.5 means there will be the same number of read and write queues.
      A value greater than 0.5 means that there will be more read queues than write queues.
      A value of 1.0 means that all the queues except one are used to dispatch read requests.
      
      Example: given a total of 10 call queues,
      a read.ratio of 0 means that the 10 queues will contain both read and write requests;
      a read.ratio of 0.3 means that 3 queues will contain only read requests and 7 queues will contain only write requests;
      a read.ratio of 0.5 means that 5 queues will contain only read requests and 5 queues will contain only write requests;
      a read.ratio of 0.8 means that 8 queues will contain only read requests and 2 queues will contain only write requests;
      a read.ratio of 1 means that 9 queues will contain only read requests and 1 queue will contain only write requests.
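
      A minimal sketch of that read.ratio split, assuming the semantics above (illustrative only, not the actual RWQueueRpcExecutor implementation; the names are hypothetical):

      public class ReadRatioSketch {
        /** Returns {readQueues, writeQueues}; 0 read queues means the queues are not split. */
        static int[] splitReadWrite(int numCallQueues, double readRatio) {
          if (readRatio <= 0) {
            return new int[] { 0, numCallQueues }; // no dedicated read queues: all queues mix reads and writes
          }
          int readQueues = Math.max(1, Math.min(numCallQueues - 1,
              (int) Math.round(numCallQueues * readRatio)));
          return new int[] { readQueues, numCallQueues - readQueues };
        }

        public static void main(String[] args) {
          // With hbase.ipc.server.num.callqueue = 10, as in the example above:
          for (double ratio : new double[] { 0, 0.3, 0.5, 0.8, 1.0 }) {
            int[] rw = splitReadWrite(10, ratio);
            System.out.println(ratio + " -> " + rw[0] + " read / " + rw[1] + " write");
          }
        }
      }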
      

      Also, add something like: separating the read and write queues can be used to "prioritize" reads vs. writes; the fewer queues an operation has, the more "throttling" you have on that operation.
      But separating read and write queues also means that reads will never be stuck waiting for a write operation to complete. (Dumb example: with 2 handlers and 1 queue, given a sequence of WRITE, WRITE, READ, the read must wait for the writes to complete. With two separate queues, one handler processing only the write queue and the other only the read queue, at any point in time you are executing both a read and a write.)

      There is also a new scan.ratio property that splits the read queues into long-read and short-read queues.

      The scan.ratio property will split the read call queues into short-read and long-read queues.
      A value lower than 0.5 means that there will be fewer long-read queues than short-read queues.
      A value of 0.5 means that there will be the same number of short-read and long-read queues.
      A value greater than 0.5 means that there will be more long-read queues than short-read queues.
      A value of 0 or 1 indicates using the same set of queues for gets and scans.
      
      Example: given a total of 8 read call queues,
      a scan.ratio of 0 or 1 means that the 8 queues will contain both long-read and short-read requests;
      a scan.ratio of 0.3 means that 2 queues will contain only long-read requests and 6 queues will contain only short-read requests;
      a scan.ratio of 0.5 means that 4 queues will contain only long-read requests and 4 queues will contain only short-read requests;
      a scan.ratio of 0.8 means that 6 queues will contain only long-read requests and 2 queues will contain only short-read requests.
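
      The same kind of sketch for the scan.ratio split, assuming the semantics above (again illustrative only, not the actual implementation):

      public class ScanRatioSketch {
        /** Returns {longReadQueues, shortReadQueues}; 0 long-read queues means gets and scans share all read queues. */
        static int[] splitScanGet(int numReadQueues, double scanRatio) {
          if (scanRatio <= 0 || scanRatio >= 1) {
            return new int[] { 0, numReadQueues }; // 0 or 1: no split between gets and scans
          }
          int longReadQueues = Math.max(1, Math.min(numReadQueues - 1,
              (int) Math.round(numReadQueues * scanRatio)));
          return new int[] { longReadQueues, numReadQueues - longReadQueues };
        }

        public static void main(String[] args) {
          // With 8 read call queues, as in the example above:
          for (double ratio : new double[] { 0.3, 0.5, 0.8 }) {
            int[] ls = splitScanGet(8, ratio);
            System.out.println(ratio + " -> " + ls[0] + " long-read / " + ls[1] + " short-read");
          }
        }
      }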
      

      And again, by dividing long reads from short reads you can "prioritize" what you need
      (same idea as the read/write split, but with long/short reads).

      That said, these properties are meant mainly for perf testing unless you really know what you are doing,
      since they are "fixed" for the RS: if you want to change them you have to restart the RS.
      The idea is to have them dynamically configurable by user/table/namespace once we have quotas,
      and maybe at some point auto-tunable based on the workload stats.

      Matteo Bertozzi added a comment -

      introduces several callQueue improvements, which can increase performance. See the JIRA for some benchmarking information

      "Improvements" seems something like "on by default". We don't have anything on by default.
      Is more like "new options to experiments with tunings".

      "ipc.server.callqueue." There was a jira that you documented that the options were renamed in "hbase.ipc..."

      For read.share, see HBASE-11724 (in progress); apparently the doc with 0, 0.5 and 1 is not clear enough.

      Overall, this doc doesn't seem to add any value beyond what is already in hbase-default.xml.
      I think the doc should provide more detailed information on why increasing a given number is good or bad, what the result will be, and so on. I'll try to come up with something for you.

      Misty Stanley-Jones added a comment -

      Ready for review.


        People

        • Assignee:
          Misty Stanley-Jones
        • Reporter:
          Misty Stanley-Jones
        • Votes:
          0
        • Watchers:
          5
