[HBASE-4033] The shutdown RegionServer could be added to AssignmentManager.servers again - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.90.3
Fix Version/s: 0.90.4
Component/s: master
Labels: None
Hadoop Flags: Reviewed

Description

The following steps easily reproduce the problem:
1. Have thousands of regions in the cluster.
2. Stop the cluster.
3. Start the cluster. Kill one regionserver while the regions are opening, then restart it after 10 seconds.

The regionserver that was shut down will appear in the AssignmentManager.servers list again.

For example:

Issue 1:

2011-06-23 14:14:30,775 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: Server information: 167-6-1-12,20020,1308803390123=2220, 167-6-1-13,20020,1308803391742=2374, 167-6-1-11,20020,1308803386333=2205, 167-6-1-13,20020,1308803514394=2183

Two regionservers (one of which had aborted) had the same hostname and port but different startcodes:
167-6-1-13,20020,1308803391742=2374
167-6-1-13,20020,1308803514394=2183
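
The duplicate above can be spotted mechanically. The following is a minimal sketch, assuming each server name has the form `host,port,startcode` as in the log line; the class `DuplicateServerCheck` and the method `findDuplicates` are hypothetical names for illustration and are not part of HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateServerCheck {

    /**
     * Groups "host,port,startcode" names by "host,port" and keeps only
     * the groups that contain more than one startcode.
     */
    static Map<String, List<Long>> findDuplicates(List<String> serverNames) {
        Map<String, List<Long>> byHostPort = new LinkedHashMap<>();
        for (String name : serverNames) {
            String[] parts = name.trim().split(",");
            if (parts.length != 3) {
                continue; // not in the assumed host,port,startcode form
            }
            String hostPort = parts[0] + "," + parts[1];
            long startCode = Long.parseLong(parts[2]);
            byHostPort.computeIfAbsent(hostPort, k -> new ArrayList<>()).add(startCode);
        }
        // Drop every host,port that was seen with a single startcode only.
        byHostPort.values().removeIf(codes -> codes.size() < 2);
        return byHostPort;
    }

    public static void main(String[] args) {
        // Server names taken from the LoadBalancer log line above.
        List<String> names = Arrays.asList(
            "167-6-1-12,20020,1308803390123",
            "167-6-1-13,20020,1308803391742",
            "167-6-1-11,20020,1308803386333",
            "167-6-1-13,20020,1308803514394");
        System.out.println(findDuplicates(names));
    }
}
```

Any `host,port` reported with two startcodes means the master is still tracking an old instance of a restarted regionserver.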

Issue 2:

(1). The RS 167-6-1-11,20020,1308105402003 finished shutting down at "10:46:37,774":
10:46:37,774 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of 167-6-1-11,20020,1308105402003

(2). An overwrite happened; it seems the RS still existed in the set of AssignmentManager#regions:
10:45:55,081 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 612342de1fe4733f72299d70addb6d11 on serverName=167-6-1-11,20020,1308105402003, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)

(3). A region was assigned to this dead RS again at "10:50:20,671":
10:50:20,671 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region Jeason10,08058613800000030,1308032774777.612342de1fe4733f72299d70addb6d11. to 167-6-1-11,20020,1308105402003
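
The sequence above can be modelled with a toy example. This is only a sketch under stated assumptions: it does not reproduce HBase code or the attached patches. `StaleServerSketch`, its fields, and its methods are hypothetical stand-ins; `servers` plays the role of a server-to-regions map such as AssignmentManager.servers, and `onlineServers` is an assumed set of servers the master considers alive. It shows how a late region update can re-create the entry of a server whose shutdown has already been processed, and how checking liveness first would prevent that.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StaleServerSketch {
    // Hypothetical stand-in for a server -> regions map.
    final Map<String, Set<String>> servers = new HashMap<>();
    // Hypothetical set of servers currently considered alive.
    final Set<String> onlineServers = new HashSet<>();

    void serverStarted(String server) {
        onlineServers.add(server);
    }

    /** Models the end of shutdown processing: the server is forgotten. */
    void serverShutdownProcessed(String server) {
        onlineServers.remove(server);
        servers.remove(server);
    }

    /** Unguarded update: re-creates the entry of a removed server. */
    void regionOnlineUnguarded(String region, String server) {
        servers.computeIfAbsent(server, k -> new HashSet<>()).add(region);
    }

    /** Guarded update: ignores a report that names a server that is not online. */
    boolean regionOnlineGuarded(String region, String server) {
        if (!onlineServers.contains(server)) {
            return false; // stale report, do not resurrect the server
        }
        servers.computeIfAbsent(server, k -> new HashSet<>()).add(region);
        return true;
    }

    public static void main(String[] args) {
        String dead = "167-6-1-11,20020,1308105402003";
        String region = "612342de1fe4733f72299d70addb6d11";

        StaleServerSketch unguarded = new StaleServerSketch();
        unguarded.serverStarted(dead);
        unguarded.serverShutdownProcessed(dead);
        unguarded.regionOnlineUnguarded(region, dead); // late update arrives
        System.out.println("unguarded re-added: " + unguarded.servers.containsKey(dead));

        StaleServerSketch guarded = new StaleServerSketch();
        guarded.serverStarted(dead);
        guarded.serverShutdownProcessed(dead);
        guarded.regionOnlineGuarded(region, dead); // late update is ignored
        System.out.println("guarded re-added: " + guarded.servers.containsKey(dead));
    }
}
```

In the unguarded variant the dead server becomes a key in the map again, which matches the symptom described in this issue; whether the real fix uses a check of this kind is not stated in the description.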

Attachments

A_hbase-root-master-167-6-1-11.rar, 25/Jun/11 02:15, 2.66 MB, Jieshan Bean
analysis.gif, 27/Jun/11 11:06, 17 kB, Jieshan Bean
test-report.txt, 30/Jun/11 08:44, 22 kB, Jieshan Bean
HBASE-4033-90-V1.patch, 30/Jun/11 09:29, 3 kB, Jieshan Bean
HBASE-4033-trunk-V1.patch, 30/Jun/11 09:30, 3 kB, Jieshan Bean
HBASE-4033-90-V2.patch, 05/Jul/11 01:42, 1 kB, Jieshan Bean
HBASE-4033-trunk-V2.patch, 07/Jul/11 08:19, 1 kB, Jieshan Bean

People

Assignee: Jieshan Bean
Reporter: Jieshan Bean
Votes: 0
Watchers: 3

Dates

Created: 25/Jun/11 02:11
Updated: 20/Nov/15 11:54
Resolved: 07/Jul/11 19:58
