Details
Bug
Status: Resolved
Critical
Resolution: Abandoned
3.0.0-alpha-1
None
Description
This was reproduced on a local (non-HDFS) cluster. Steps to reproduce:
- Start a single node cluster and start an additional RS using local-regionservers.sh
- In the HBase shell, add a new rsgroup
hbase(main):001:0> add_rsgroup 'test_rsgroup'
Took 0.5503 seconds
hbase(main):002:0> list_rsgroups
NAME            SERVER / TABLE
test_rsgroup
default         server dob2-r3n13:16020
                server dob2-r3n13:16022
                table hbase:meta
                table hbase:acl
                table hbase:quota
                table hbase:namespace
                table hbase:rsgroup
2 row(s)
Took 0.0419 seconds
- Move one of the region servers to the new rsgroup
hbase(main):004:0> move_servers_rsgroup 'test_rsgroup',['dob2-r3n13:16020']
Took 6.4894 seconds
hbase(main):005:0> exit
- Stop the regionserver which is left in the default rsgroup
local-regionservers.sh stop 2
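For convenience, the steps above can be collected into one script. This is only a sketch under several assumptions that are not stated in the report: HBase's bin/ directory is on PATH, the rsgroup feature is already enabled in the cluster configuration, the extra region server was started with offset 2 (inferred from the `stop 2` step), and `hbase shell -n` is used to run commands non-interactively. The helper names `run`, `hbase_cmd` and `repro` are made up for illustration. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps from this report (not an official script).
# Default is a dry run that only prints each command; DRY_RUN=0 executes them.
set -eu

GROUP="test_rsgroup"        # rsgroup name used in the report
SERVER="dob2-r3n13:16020"   # region server moved into the new group (from the report)

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Send one command to a non-interactive hbase shell (or print it in dry-run mode).
hbase_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ hbase shell: $1"
  else
    echo "$1" | hbase shell -n
  fi
}

repro() {
  run start-hbase.sh                    # single-node cluster
  run local-regionservers.sh start 2    # additional RS (offset 2 is an assumption)
  hbase_cmd "add_rsgroup '${GROUP}'"
  hbase_cmd "move_servers_rsgroup '${GROUP}',['${SERVER}']"
  run local-regionservers.sh stop 2     # stop the RS left in the default rsgroup
}

repro
```

Running it with the default settings prints the five commands in order, which is a quick way to check the sequence before running it against a real cluster.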
After this, the cluster becomes unusable. It does not recover even if the region server is restarted, or if all the services are brought down and brought back up.
In 1.1.x, the cluster recovers fine. It looks like meta is assigned to a dummy region server, and when the stopped region server is restarted, meta gets assigned. The following is what the master UI shows while the RS is down:
1588230740 hbase:meta,,1.1588230740 state=PENDING_OPEN, ts=Wed May 23 18:24:01 EDT 2018 (1s ago), server=localhost,1,1
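In the line above, the `server=localhost,1,1` field does not match either region server listed earlier (`dob2-r3n13:16020` or `dob2-r3n13:16022`), which is consistent with the dummy-server observation. As a rough aid for checking this from copied master UI text, the snippet below pulls the `state=` and `server=` fields out of such a line. The function names `region_state` and `region_server` are hypothetical, and the line format is assumed to be exactly the one shown in this report.

```shell
#!/usr/bin/env bash
# Sketch: extract the state= and server= fields from a region-state line
# copied from the master UI. The format is assumed from this report only.

line='1588230740 hbase:meta,,1.1588230740 state=PENDING_OPEN, ts=Wed May 23 18:24:01 EDT 2018 (1s ago), server=localhost,1,1'

# Print the value of state= (upper-case letters and underscores, up to the comma).
region_state() {
  printf '%s\n' "$1" | sed -n 's/.* state=\([A-Z_]*\),.*/\1/p'
}

# Print everything after server= up to the end of the line.
region_server() {
  printf '%s\n' "$1" | sed -n 's/.* server=\(.*\)$/\1/p'
}

echo "state:  $(region_state "$line")"
echo "server: $(region_server "$line")"
```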