Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-14926

Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets stuck reading

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.2.0, 1.1.2, 1.3.0, 1.0.3, 0.98.16, 2.0.0
    • 1.2.0, 1.3.0, 0.98.17, 2.0.0
    • Thrift
    • None
    • Reviewed
    • Adds a timeout to server read from clients. Adds new configs hbase.thrift.server.socket.read.timeout for setting read timeout on server socket in milliseconds. Default is 60000;

    Description

      Thrift server is hung. All worker threads are doing this:

      "thrift-worker-0" daemon prio=10 tid=0x00007f0bb95c2800 nid=0xf6a7 runnable [0x00007f0b956e0000]
         java.lang.Thread.State: RUNNABLE
              at java.net.SocketInputStream.socketRead0(Native Method)
              at java.net.SocketInputStream.read(SocketInputStream.java:152)
              at java.net.SocketInputStream.read(SocketInputStream.java:122)
              at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
              at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
              at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
              - locked <0x000000066d859490> (a java.io.BufferedInputStream)
              at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
              at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
              at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
              at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:601)
              at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:470)
              at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
              at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
              at org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:64)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      

      They never recover.

      I don't have client side logs.

      We've been here before: HBASE-4967 "connected client thrift sockets should have a server side read timeout" but this patch only got applied to fb branch (and thrift has changed since then).

      Attachments

        1. 14926.patch
          11 kB
          Michael Stack
        2. 14926v2.txt
          12 kB
          Michael Stack

        Activity

          People

            stack Michael Stack
            stack Michael Stack
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: