HBASE-3813

Change RPC callQueue size from "handlerCount * MAX_QUEUE_SIZE_PER_HANDLER;"

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.92.0
    • Fix Version/s: 0.92.0
    • Component/s: None
    • Labels: None

      Description

      Yesterday debugging w/ Jack we noticed that with few handlers on a big box, he was seeing stats like this:

      2011-04-21 11:54:49,451 DEBUG org.apache.hadoop.ipc.HBaseServer: Server connection from X.X.X.X:60931; # active connections: 11; # queued calls: 2500
      

      We had 2500 items in the rpc queue waiting to be processed.

      Turns out he had too few handlers for the number of clients (but also, it seems he had found hardware issues: his RAM bus was running at 1/4 the rate it should have been).

      Chatting w/ J-D this morning, he asked if the queues hold 'data'. The queues hold 'Calls'. Calls are the client requests; they contain the data.

      Jack had 2500 items queued. If each item to insert was 1MB, that's 2500 * 1MB (~2.5GB) of memory that sits outside of our general accounting.

      Currently the queue size is handlers * MAX_QUEUE_SIZE_PER_HANDLER where MAX_QUEUE_SIZE_PER_HANDLER is hardcoded to be 100.

      If the queue is full we block (LinkedBlockingQueue).

      Going to change the per-handler queue size from 100 to 10 by default, but will also make it configurable and will document this as a possible cause of OOME. Will try it in production here before committing the patch.
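      For illustration, a minimal sketch of the shape of the change (the per-handler config key below is hypothetical, not necessarily what the patch uses; hbase.regionserver.handler.count is the existing handler-count key):

          // Was: hardcoded multiplier, MAX_QUEUE_SIZE_PER_HANDLER = 100.
          // Now: read the per-handler multiplier from the config, default 10.
          int handlerCount = conf.getInt("hbase.regionserver.handler.count", 10);
          int maxQueueSizePerHandler =
              conf.getInt("ipc.server.max.queue.size.per.handler", 10); // hypothetical key
          this.callQueue =
              new LinkedBlockingQueue<Call>(handlerCount * maxQueueSizePerHandler);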

        Activity

        stack created issue -
        Todd Lipcon added a comment -

        Can we write our own blocking behavior based on queue memory usage rather than queue count? Not sure how difficult it would be to make Call implement HeapSize, though.
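        (For illustration, a minimal sketch of that idea, assuming the size of the deserialized parameter could be recorded on the Call; HeapSize is the existing org.apache.hadoop.hbase.io.HeapSize interface, but the paramSize field is hypothetical:)

            import org.apache.hadoop.hbase.io.HeapSize;

            class Call implements HeapSize {
              private long paramSize; // estimated bytes, recorded at deserialization time

              public long heapSize() {
                // A fuller version would add the fixed per-Call overhead here.
                return paramSize;
              }
            }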

        stack added a comment -

        Todd, you can do anything! (J/K). Yes, that sounds good. We have 'blocking' going on in the app already when memstores fill. I'm thinking though that we'd want to just do crass smaller queues for 0.90.3 and then a sizing fix for 0.92.0. (We were going to run some tests here on our frontend to make sure there are no side effects from taking the queue size down.)

        Gary Helmling added a comment -

        Implementing HeapSize might be kinda tough since the Call -> Invocation -> parameters can be more or less arbitrary objects.

        I wonder if we could do some estimation, though, by wrapping the input stream during de-serialization and counting the number of bytes read?
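        (A minimal sketch of such a wrapper, plain java.io with nothing HBase-specific assumed; read the counter after deserializing a Call to get a cheap size estimate:)

            import java.io.FilterInputStream;
            import java.io.IOException;
            import java.io.InputStream;

            class CountingInputStream extends FilterInputStream {
              private long count; // bytes consumed so far

              CountingInputStream(InputStream in) { super(in); }

              @Override
              public int read() throws IOException {
                int b = in.read();
                if (b >= 0) count++;
                return b;
              }

              @Override
              public int read(byte[] buf, int off, int len) throws IOException {
                int n = in.read(buf, off, len);
                if (n > 0) count += n;
                return n;
              }

              long getByteCount() { return count; }
            }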

        Todd Lipcon added a comment -

        I wonder if we could do some estimation, though, by wrapping the input stream during de-serialization and counting the number of bytes read?

        Or push the actual deserialization into the handler thread - i.e., queue the calls as just the byte buffers, and deserialize once they hit the handler?

        Gary Helmling added a comment -

        Or push the actual deserialization into the handler thread - ie queue the calls as just the byte buffers, and deserialize once they hit the handler?

        Sure, looking through the code, we're already allocating and reading into a ByteBuffer in HBaseServer.Connection.readAndProcess(). It's just that we immediately convert to a byte[] when passing to processData(), where it does the deserialization. Seems like a good gain to move any deserialization overhead out of the listener thread and into the handlers, with some easy memory accounting for the queue to boot.
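        (Roughly the split being proposed, as a sketch; RawCall and Handler below are illustrative stand-ins, not the actual HBaseServer classes. The nice property is that the queue's memory footprint is just the sum of data.length:)

            import java.io.ByteArrayInputStream;
            import java.io.DataInputStream;
            import java.util.concurrent.LinkedBlockingQueue;

            // Listener side queues the raw request bytes, undecoded.
            class RawCall {
              final byte[] data;
              RawCall(byte[] data) { this.data = data; }
            }

            class Handler implements Runnable {
              private final LinkedBlockingQueue<RawCall> callQueue;

              Handler(LinkedBlockingQueue<RawCall> callQueue) { this.callQueue = callQueue; }

              public void run() {
                try {
                  while (true) {
                    RawCall raw = callQueue.take();
                    // The deserialization cost is paid here, in the handler,
                    // not in the listener thread.
                    DataInputStream dis =
                        new DataInputStream(new ByteArrayInputStream(raw.data));
                    // ... read the call parameter from dis and invoke it ...
                  }
                } catch (InterruptedException ie) {
                  Thread.currentThread().interrupt();
                }
              }
            }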

        stack added a comment -

        Patch for 0.90 branch.

        My thinking is this needs a fix for 0.90.3. 100 times the handler count can turn ugly real fast if cells are of any significant size and the RS stalls for a moment and queues back up. This patch at least makes it configurable, with the default tuned down from 100 to more like 10 or so.

        Todd and Gary, you fellas are talking about a more correct fix. This unaccounted memory usage is going to mess us up over and over again, so I think it's a critical issue in need of a proper fix, but I'm thinking the proper fix belongs over in 0.92.0?

        I'm fine w/ this workaround not going into 0.90.3. Just putting it up here in case folks are amenable.

        stack made changes -
        Attachment: 3813.txt [ 12478247 ]
        Todd Lipcon added a comment -

        +1 on this fix for now.

        stack added a comment -

        I applied the attached workaround patch for now. Moving this issue to 0.92.0 for a better fix.

        stack made changes -
        Affects Version/s: 0.92.0 [ 12314223 ]
        Affects Version/s: 0.90.3 [ 12316313 ]
        Hudson added a comment -

        Integrated in HBase-TRUNK #1909 (See https://builds.apache.org/hudson/job/HBase-TRUNK/1909/)

        Ted Yu added a comment -

        BlockingQueue has remainingCapacity(), which returns the number of additional elements the queue can ideally (in the absence of memory or resource constraints) accept without blocking.
        Maybe we should write an implementation of BlockingQueue that is aware of the sizes of the objects it holds, meaning it would block if the next element to be queued would push the total estimated heap consumed past a pre-determined threshold.

        I wonder if we should add a variant of HbaseObjectWritable.readObject() that records the size of the Object returned.
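        (A minimal sketch of blocking on bytes rather than count; WritableWithSize and getWritableSize() follow the names used later in this thread, and maxBytes is a hypothetical threshold. It deliberately admits one oversized element into an empty queue so a single huge call can't block forever:)

            import java.util.ArrayDeque;
            import java.util.Deque;

            interface WritableWithSize {
              long getWritableSize();
            }

            class SizeBoundedQueue<E extends WritableWithSize> {
              private final Deque<E> queue = new ArrayDeque<E>();
              private final long maxBytes;
              private long usedBytes; // guarded by this

              SizeBoundedQueue(long maxBytes) { this.maxBytes = maxBytes; }

              synchronized void put(E e) throws InterruptedException {
                // Block while admitting e would exceed the threshold, the way a
                // full LinkedBlockingQueue blocks on element count.
                while (!queue.isEmpty()
                    && usedBytes + e.getWritableSize() > maxBytes) {
                  wait();
                }
                usedBytes += e.getWritableSize();
                queue.addLast(e);
                notifyAll(); // wake any blocked take()
              }

              synchronized E take() throws InterruptedException {
                while (queue.isEmpty()) {
                  wait();
                }
                E e = queue.removeFirst();
                usedBytes -= e.getWritableSize();
                notifyAll(); // wake any blocked put()
                return e;
              }
            }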

        Ted Yu added a comment -

        We need to consider the fact that after a Call object is removed from callQueue, it is enqueued to Connection.responseQueue.
        This means that in order to limit the heap consumption of Call objects, callQueue and Connection.responseQueue have to be managed jointly.
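        (One way to picture the joint accounting, with all names illustrative: a shared counter charged when the Call enters callQueue and released only after its response has left Connection.responseQueue:)

            import java.util.concurrent.atomic.AtomicLong;

            class CallAccounting {
              private final AtomicLong inFlightBytes = new AtomicLong();

              // Charged when the Call is put on callQueue.
              void charge(long callSize) { inFlightBytes.addAndGet(callSize); }

              // Released only after the response is written, not when the
              // handler dequeues the Call, so responseQueue time is covered.
              void release(long callSize) { inFlightBytes.addAndGet(-callSize); }

              long inFlight() { return inFlightBytes.get(); }
            }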

        stack added a comment -

        @Ted Sounds good if you can figure the size of a Call object in a non-intrusive way (as per Gary's comment above).

        Ted Yu added a comment -

        My proposal doesn't involve moving the deserialization overhead into the handlers.
        The primary reason is that we should determine the actual size of the parameter object for the Call.

        So in processData(), we would have:

            HbaseObjectWritable objectWritable = new HbaseObjectWritable();
            Writable param = HbaseObjectWritable.readObject(dis, objectWritable, conf);
        

        I have cloned LinkedBlockingQueueBySize off of LinkedBlockingQueue. Its declaration is:

        public class LinkedBlockingQueueBySize<E extends WritableWithSize> extends AbstractQueue<E>
                implements BlockingQueue<E>, java.io.Serializable {
        

        Then we can utilize this method in HbaseObjectWritable:

          public static long getWritableSize(Object instance, Class declaredClass,
                                             Configuration conf) {
        
        Ted Yu added a comment -

        W.r.t. moving the deserialization overhead into the handlers: it implies we replace the current call queues with (serialized) parameter queue(s).
        Currently we have:

              if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
                priorityCallQueue.put(call);
        

        getQosLevel() requires a deserialized Writable. This means there would be only one parameter queue after this change.

        I assume there would be no call queue, because we don't want to keep serialized and deserialized forms of the same parameter at the same time.

        stack added a comment -

        This was committed a while back. Resolving.

        stack made changes -
        Status: Open [ 1 ] → Resolved [ 5 ]
        Fix Version/s: 0.92.0 [ 12314223 ]
        Resolution: Fixed [ 1 ]
        Jean-Daniel Cryans made changes -
        Assignee: stack [ stack ]

          People

          • Assignee: stack
          • Reporter: stack
          • Votes: 0
          • Watchers: 2
