HBase / HBASE-8762

Performance/operational penalty when calling HTable.get with a list of one Get

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.95.2, 0.94.9
    • Component/s: Client
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Calling HTable.get with a list containing a single Get has two consequences:
      1. The overhead of processBatch is paid unnecessarily, and that overhead is significant.
      2. The request shows up as a 'multi' when reviewing RPC handlers, although it should appear as a single Get. It likely shows up as a multi in other places in the logs and UI as well.

      To give some context to the overhead, here are some timings performed by a member of our team:

      In a very simple test, we read the same key 100 times, recorded the total time, and repeated this 10 times (1000 gets in total). The times are as follows, excluding the first iteration, which included considerable HBase warm-up time on the JVM for establishing connections:

      Iteration   Batch (ms)   Single (ms)
      1           2255         815
      2           1545         823
      3           1427         742
      4           1451         721
      5           1480         775
      6           1379         735
      7           1657         775
      8           1392         804
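
To put the timings above in proportion, the following small sketch averages the eight reported iterations and derives a per-get cost, assuming each iteration covers 100 gets as described. The class and constant names are illustrative only and are not part of HBase; the numbers are copied from the table.

```java
import java.util.Arrays;

// Illustrative helper, not part of HBase: summarizes the timings reported in this issue.
class GetTimingSummary {
    // Per-iteration totals in milliseconds, copied from the table in the description.
    static final int[] BATCH_MS = {2255, 1545, 1427, 1451, 1480, 1379, 1657, 1392};
    static final int[] SINGLE_MS = {815, 823, 742, 721, 775, 735, 775, 804};

    // Each iteration issued 100 gets, per the test description.
    static final int GETS_PER_ITERATION = 100;

    // Arithmetic mean of the samples; NaN if the array is empty.
    static double mean(int[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        double batch = mean(BATCH_MS);
        double single = mean(SINGLE_MS);
        System.out.println("batch  mean ms/iteration: " + batch);
        System.out.println("single mean ms/iteration: " + single);
        System.out.println("batch  mean ms/get: " + batch / GETS_PER_ITERATION);
        System.out.println("single mean ms/get: " + single / GETS_PER_ITERATION);
        System.out.println("batch/single ratio: " + batch / single);
    }
}
```

Across these eight iterations the batch path averages 1573.25 ms and the single path 773.75 ms, so the list-of-one call costs roughly twice as much as the single Get call in this test.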

      While I can see the argument that callers should use the single-Get method signature, the cost difference is surprising, and it is very easy to handle this case well. We simply need to have HTable.get(List<Get>) delegate to HTable.get(Get) when the list contains exactly one Get.
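
The proposed delegation can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: SketchGet, SketchResult, and SketchTable are hypothetical stand-ins for the HBase client classes, and the processBatch method here only simulates a batch path so the example runs without a cluster. The counters exist solely to show which path was taken.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a Get request; NOT org.apache.hadoop.hbase.client.Get.
final class SketchGet {
    final String row;
    SketchGet(String row) { this.row = row; }
}

// Hypothetical stand-in for a Result; NOT org.apache.hadoop.hbase.client.Result.
final class SketchResult {
    final String row;
    SketchResult(String row) { this.row = row; }
}

// Hypothetical table that mimics the two get overloads discussed in this issue.
class SketchTable {
    int singleCalls = 0; // number of times the single-get path ran
    int batchCalls = 0;  // number of times the simulated batch path ran

    // Single-get path: one request, no batch machinery.
    SketchResult get(SketchGet get) {
        singleCalls++;
        return new SketchResult(get.row);
    }

    // List path: delegate to the single-get path when the list has exactly one element.
    SketchResult[] get(List<SketchGet> gets) {
        if (gets.size() == 1) {
            return new SketchResult[] { get(gets.get(0)) };
        }
        return processBatch(gets);
    }

    // Simulated batch path (assumption: stands in for the real multi request).
    private SketchResult[] processBatch(List<SketchGet> gets) {
        batchCalls++;
        SketchResult[] results = new SketchResult[gets.size()];
        for (int i = 0; i < gets.size(); i++) {
            results[i] = new SketchResult(gets.get(i).row);
        }
        return results;
    }
}

class SingleGetDelegationSketch {
    public static void main(String[] args) {
        SketchTable table = new SketchTable();
        table.get(Collections.singletonList(new SketchGet("row-1")));
        table.get(Arrays.asList(new SketchGet("row-1"), new SketchGet("row-2")));
        System.out.println("single path calls: " + table.singleCalls);
        System.out.println("batch path calls: " + table.batchCalls);
    }
}
```

With this shape, callers that happen to build a one-element list get the same cost and the same RPC classification as callers of the single-Get overload, while lists of any other size still go through the batch path.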

      Attachments

        1. HBASE-8762.patch
          0.5 kB
          Jason Bray
        2. HBASE-8672-trunk.patch
          0.5 kB
          Jason Bray

          People

            Assignee: Unassigned
            Reporter: Jason Bray (jasonbray)
            Votes: 0
            Watchers: 4
