HBase
  1. HBase
  2. HBASE-5614

createTableAsync gets blocked and it is not an asynchronous operation

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 0.90.4
    • Fix Version/s: None
    • Component/s: master, regionserver
    • Labels:

      Description

      I have created table A using HBaseAdmin.createTableAsync() with 2000 pre-split regions, the expectation was client RPC call does not get blocked and i had only used 5 minutes as RPC socket timeout value.

      It appears from logs/code that clients gets blocked (honoring the hbase.rpc.timeout value), times out after 5 minutes, master nio channel gets closed (related to client which invoked the createAsyncTable operation), put region entries in inconsistent state in META, meaning info:server is marked as null and AssignmentManager gets null server value (which causes NPE) and causes to table creation fail, i could not succeed until changing the hbase.rpc.timeout value to 15 minutes and created tables one at a time (each table has more than 2K pre-split regions in it)

      Please refer the logs.

        Activity

        Mubarak Seyed created issue -
        Hide
        Mubarak Seyed added a comment -

        HBaseAdmin.createAsyncTable call from client to Master:

        12/03/21 18:00:26 WARN zookeeper.ZKTable: Moving table table_A state to enabled but was already enabled
        12/03/21 18:00:26 INFO wal.HLog: HLog configuration: blocksize=128 MB, rollsize=121.6 MB, enabled=true, optionallogflushinternal=1000ms
        12/03/21 18:00:26 INFO wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
        12/03/21 18:00:26 INFO wal.HLog: New hlog /hbase/root/table_A/6d3fe8aa7eae1c13b4ce5b54a13af1e0/.logs/hlog.1332352826075
        12/03/21 18:00:26 INFO wal.HLog: Using getNumCurrentReplicas--HDFS-826
        12/03/21 18:00:26 INFO regionserver.HRegion: Onlined table_A,,1332352825867.6d3fe8aa7eae1c13b4ce5b54a13af1e0.; next sequenceid=1
        12/03/21 18:00:26 INFO catalog.MetaEditor: Added 1 regions to META
        12/03/21 18:00:26 INFO regionserver.HRegion: Closed table_A,,1332352825867.6d3fe8aa7eae1c13b4ce5b54a13af1e0.
        12/03/21 18:00:26 INFO regionserver.HRegion: Onlined table_A,0015d867,1332352825873.ecf7c1e64eae49c396577e3156bc6555.; next sequenceid=1
        12/03/21 18:00:26 INFO regionserver.HRegion: Closed table_A,0015d867,1332352825873.ecf7c1e64eae49c396577e3156bc6555.
        

        After 5 minutes hbase.rpc.timeout value, master client channel gets ClosedChannelException

        12/03/21 18:05:28 WARN ipc.HBaseServer: IPC Server Responder, call createTable({NAME => 'table_A', FAMILIES => [{NAME => 'CF1s', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZO', TTL => '298700000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}, [[B@25a0504a) from 1.1.1.1:34981: output error
        12/03/21 18:05:28 WARN ipc.HBaseServer: IPC Server handler 73 on 60000 caught: java.nio.channels.ClosedChannelException
                at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
                at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
                at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
                at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
                at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
                at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)
        

        AssignmentManager gets null value as server location

        12/03/21 18:35:13 INFO master.AssignmentManager: Server null returned java.lang.NullPointerException: Passed server is null for 6d3fe8aa7eae1c13b4ce5b54a13af1e0
        12/03/21 18:35:13 INFO master.AssignmentManager: Server null returned java.lang.NullPointerException: Passed server is null for a47815dfcdf24f49e40f50b6fdd65dd9
        12/03/21 18:35:13 INFO master.AssignmentManager: Server null returned java.lang.NullPointerException: Passed server is null for ecf7c1e64eae49c396577e3156bc6555
        

        From master log, bulk assignment happens (100 regions at a time), i can increase the value to 200 (or higher) but createTableAsync shouldn't get blocked.

        Show
        Mubarak Seyed added a comment - HBaseAdmin.createAsyncTable call from client to Master: 12/03/21 18:00:26 WARN zookeeper.ZKTable: Moving table table_A state to enabled but was already enabled 12/03/21 18:00:26 INFO wal.HLog: HLog configuration: blocksize=128 MB, rollsize=121.6 MB, enabled= true , optionallogflushinternal=1000ms 12/03/21 18:00:26 INFO wal.SequenceFileLogWriter: Using syncFs -- HDFS-200 12/03/21 18:00:26 INFO wal.HLog: New hlog /hbase/root/table_A/6d3fe8aa7eae1c13b4ce5b54a13af1e0/.logs/hlog.1332352826075 12/03/21 18:00:26 INFO wal.HLog: Using getNumCurrentReplicas--HDFS-826 12/03/21 18:00:26 INFO regionserver.HRegion: Onlined table_A,,1332352825867.6d3fe8aa7eae1c13b4ce5b54a13af1e0.; next sequenceid=1 12/03/21 18:00:26 INFO catalog.MetaEditor: Added 1 regions to META 12/03/21 18:00:26 INFO regionserver.HRegion: Closed table_A,,1332352825867.6d3fe8aa7eae1c13b4ce5b54a13af1e0. 12/03/21 18:00:26 INFO regionserver.HRegion: Onlined table_A,0015d867,1332352825873.ecf7c1e64eae49c396577e3156bc6555.; next sequenceid=1 12/03/21 18:00:26 INFO regionserver.HRegion: Closed table_A,0015d867,1332352825873.ecf7c1e64eae49c396577e3156bc6555. After 5 minutes hbase.rpc.timeout value, master client channel gets ClosedChannelException 12/03/21 18:05:28 WARN ipc.HBaseServer: IPC Server Responder, call createTable({NAME => 'table_A', FAMILIES => [{NAME => 'CF1s', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZO', TTL => '298700000', BLOCKSIZE => '65536', IN_MEMORY => ' false ', BLOCKCACHE => ' true '}]}, [[B@25a0504a) from 1.1.1.1:34981: output error 12/03/21 18:05:28 WARN ipc.HBaseServer: IPC Server handler 73 on 60000 caught: java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341) at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727) at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083) AssignmentManager gets null value as server location 12/03/21 18:35:13 INFO master.AssignmentManager: Server null returned java.lang.NullPointerException: Passed server is null for 6d3fe8aa7eae1c13b4ce5b54a13af1e0 12/03/21 18:35:13 INFO master.AssignmentManager: Server null returned java.lang.NullPointerException: Passed server is null for a47815dfcdf24f49e40f50b6fdd65dd9 12/03/21 18:35:13 INFO master.AssignmentManager: Server null returned java.lang.NullPointerException: Passed server is null for ecf7c1e64eae49c396577e3156bc6555 From master log, bulk assignment happens (100 regions at a time), i can increase the value to 200 (or higher) but createTableAsync shouldn't get blocked.
        Hide
        Gregory Chanan added a comment -

        Are you going to look into fixing this Mubarak? If not, I'll take a look.

        Show
        Gregory Chanan added a comment - Are you going to look into fixing this Mubarak? If not, I'll take a look.
        Hide
        Mubarak Seyed added a comment -

        @Gregory
        I am working on it. Thanks.

        Show
        Mubarak Seyed added a comment - @Gregory I am working on it. Thanks.
        Andrew Purtell made changes -
        Field Original Value New Value
        Labels noob beginner

          People

          • Assignee:
            Unassigned
            Reporter:
            Mubarak Seyed
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:

              Development