HBase / HBASE-1849

HTable doesn't work well at the core of a multi-threaded server; e.g. webserver

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Performance
    • Labels:
      None

      Description

      HTable must do the following:

      + Sit in a shell or simple client (e.g. a Map or Reduce task) and feed and read from HBase single-threadedly. It does this job OK.
      + Sit at the core of a multithreaded server (100s of threads), such as a webserver or Thrift gateway, and keep throughput high. It's currently not good at this job.

      The following stand in the way of achieving the second item above:

      + HTable must seek out and cache region locations. The cache is kept down in HConnectionManager, and one cache is shared by all HTable instances that were made with the same HBaseConfiguration instance. Region lookups happen inside a synchronized block: if the wanted region is in the cache, the lock is held only briefly; otherwise the lock is held until the trip to the server completes (which may require retries). Meanwhile all other work is blocked, even if we're using HTablePool.
      + Regardless of the identity of the HBaseConfiguration, Hadoop RPC has ONE Connection open to a server at a time; requests and responses are multiplexed over this single connection.

      Broken stuff:

      + Puts are synchronized to protect the write buffer, so only one thread at a time appends, but flushCommits is open for any thread to call. Once the write buffer is full, all Puts block until it is freed again. This looks like a hang when there are hundreds of threads, each write goes to a random region in a big table, and each write has to have its region looked up. (There may be some other brokenness in here, because this bottleneck seems to last longer than it should, even with hundreds of threads.)
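
To make the write-buffer contention concrete, here is a minimal sketch of one possible mitigation: hold the lock only long enough to append or to swap in a fresh buffer, and do the slow send outside the lock. This is not HTable's code. The class `SwappingWriteBuffer`, the `sender` callback (a stand-in for the RPC to the region server), and the use of plain strings for edits are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical write buffer; illustrates lock scope only, not HTable's behavior. */
final class SwappingWriteBuffer {
    private final Object lock = new Object();
    private final int flushThreshold;
    private final Consumer<List<String>> sender; // assumed stand-in for the server RPC
    private List<String> buffer = new ArrayList<>();

    SwappingWriteBuffer(int flushThreshold, Consumer<List<String>> sender) {
        this.flushThreshold = flushThreshold;
        this.sender = sender;
    }

    /** Appends an edit; if the buffer is full, swaps it out and sends it. */
    void put(String edit) {
        List<String> full = null;
        synchronized (lock) {
            buffer.add(edit);
            if (buffer.size() >= flushThreshold) {
                full = buffer;              // take ownership of the full buffer
                buffer = new ArrayList<>(); // other threads keep appending here
            }
        }
        if (full != null) {
            sender.accept(full); // slow work happens outside the lock
        }
    }

    /** Sends whatever is buffered, again without holding the lock while sending. */
    void flush() {
        List<String> pending;
        synchronized (lock) {
            pending = buffer;
            buffer = new ArrayList<>();
        }
        if (!pending.isEmpty()) {
            sender.accept(pending);
        }
    }
}
```

The trade-off in this sketch is that several batches can be in flight at once, so it gives no ordering guarantee across batches and no bound on memory held by unsent batches; a real client would need to decide how to handle both.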

      Ideas:

      + Querying the cache should not block all access to the cache. Block only the callers whose wanted region is currently being looked up, so that reads and writes to regions whose location we already know can go ahead.
      + An NIO-based client and server.
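
The first idea above can be sketched with a per-key future, so that only threads asking for the same uncached region wait on its lookup. This is a minimal sketch, not the HConnectionManager implementation. The class `RegionLocationCache`, the `remoteLookup` function (a stand-in for the trip to the server), and the use of strings for region keys and locations are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical region-location cache with per-region blocking. */
final class RegionLocationCache {
    private final ConcurrentMap<String, CompletableFuture<String>> locations =
            new ConcurrentHashMap<>();
    private final Function<String, String> remoteLookup; // assumed stand-in for the server trip

    RegionLocationCache(Function<String, String> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    /** Returns the location, waiting only if this same region is being looked up. */
    String locate(String regionKey) {
        CompletableFuture<String> known = locations.get(regionKey);
        if (known != null) {
            return known.join(); // cached, or in flight for this key only
        }
        CompletableFuture<String> mine = new CompletableFuture<>();
        CompletableFuture<String> raced = locations.putIfAbsent(regionKey, mine);
        if (raced != null) {
            return raced.join(); // another thread started the lookup first
        }
        try {
            String location = remoteLookup.apply(regionKey); // no shared lock held here
            mine.complete(location);
            return location;
        } catch (RuntimeException e) {
            locations.remove(regionKey, mine); // let a later call retry
            mine.completeExceptionally(e);
            throw e;
        }
    }

    /** Drops a stale entry, e.g. after a region has moved. */
    void invalidate(String regionKey) {
        locations.remove(regionKey);
    }
}
```

In this sketch a slow lookup for one region never delays a thread asking for a different region, and concurrent callers for the same region share one lookup instead of each making the trip.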

        Issue Links

          Activity

          stack created issue -
          Jonathan Gray made changes -
          Field Original Value New Value
          Link This issue is related to HBASE-1845 [ HBASE-1845 ]
          Jonathan Gray added a comment -

          Great issue, stack.

          Also, we need to consider how to best add additional threading to HTable. Specifically, for MultiGet/Put/Delete, which Erik and I are working on right now over in HBASE-1845. We are not at all taking advantage of running batch puts in parallel right now, and it's especially important for MultiGet which could drastically improve performance by distributing the calls in parallel.

          Additional discussion of the specifics should happen in the other issue, just wanted to link these up.

          stack added a comment -

          See the thread dump in https://issues.apache.org/jira/browse/HBASE-1753 for example of how client can get hung up on synchronized batch put in particular.

          Andrew Purtell made changes -
          Link This issue is related to HBASE-2182 [ HBASE-2182 ]
          Benoit Sigoure added a comment -

          I've been working on this for the past 2 weeks, although I'm guessing that my solution won't be really satisfactory for this issue. I wrote another HBase client from scratch, and it's been written from the ground up to work well in a multi-threaded environment. I'll open-source it in a few days, stay tuned.

          Benoit Sigoure made changes -
          Assignee Benoit Sigoure [ tsuna ]
          ryan rawson added a comment -

          Some of the original complaints have been fixed. HTablePool does some things. The advice has generally been: don't share an HTable between threads.

          The granularity of the locks in HCM was improved, and while things are not all better, there are substantial improvements since this issue was filed.

          stack added a comment -

          @Benôit: Bring it on!

          Karthick Sankarachary made changes -
          Link This issue is related to HBASE-2939 [ HBASE-2939 ]
          Todd Lipcon made changes -
          Component/s performance [ 12314193 ]
          Andrew Purtell added a comment -

          It's not perfect, but the client has come a long way. No action on this issue for a long time; resolving as Incomplete.

          Andrew Purtell made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Assignee Benoit Sigoure [ tsuna ]
          Resolution Incomplete [ 4 ]
          Transition: Open → Resolved
          Time In Source Status: 1765d 19h 1m
          Execution Times: 1
          Last Executer: Andrew Purtell
          Last Execution Date: 19/Jul/14 01:46

            People

            • Assignee: Unassigned
            • Reporter: stack
            • Votes: 0
            • Watchers: 6
