[AMBARI-12488] RU - Use haadmin failover command instead of killing ZKFC during upgrade/downgrade - ASF JIRA

Details

Type: Story
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.0
Fix Version/s: 2.1.1
Component/s: ambari-server
Labels:
- rolling_upgrade

Description

Currently, Rolling Upgrade (RU) orchestration kills ZKFC on the active NameNode during upgrade/downgrade to initiate a failover to the standby. We should use the haadmin failover command instead.
For example:

su hdfs -c 'hdfs haadmin -failover nn1 nn2'

Here nn1 is the current NameNode, provided it is the active one, and nn2 is the remaining NameNode.
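
A minimal sketch of that check, assuming the orchestration can query each NameNode's HA state with `hdfs haadmin -getServiceState` before deciding whether to fail over. The function names `get_service_state` and `build_failover_command` are hypothetical and are not taken from the attached patch; the returned command would still need to be run as the hdfs user, as in the example above.

```python
import subprocess


def get_service_state(nn_id, run=subprocess.check_output):
    """Return the HA state ('active' or 'standby') of a NameNode service ID.

    `run` is injectable so the function can be exercised without a cluster.
    """
    out = run(["hdfs", "haadmin", "-getServiceState", nn_id])
    return out.decode().strip().lower()


def build_failover_command(current_nn, other_nn, get_state=get_service_state):
    """Return the failover argv if current_nn is active, otherwise None.

    Hypothetical helper: a failover is only requested when the NameNode
    being worked on is the active one.
    """
    if get_state(current_nn) != "active":
        return None  # already standby, nothing to fail over
    return ["hdfs", "haadmin", "-failover", current_nn, other_nn]
```

Returning the argument list instead of executing it keeps the decision logic separate from how the command is launched (for example through `su hdfs -c`).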

This is safer than killing ZKFC on the active NameNode because the command first tries to gracefully transition the active NameNode to the Standby state. If that fails, the fencing methods (as configured by dfs.ha.fencing.methods) are attempted until one succeeds. After this process, the second NameNode is transitioned to the Active state.
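
The ordering described above can be modelled as a short control-flow sketch. This is an illustration only, not HDFS code: the function `failover` and its callable parameters are hypothetical, and raising an error when no fencing method succeeds is an assumption about how the failure case is handled.

```python
def failover(graceful_to_standby, fencing_methods, activate_target):
    """Model the failover order: graceful transition, then fencing, then activation.

    graceful_to_standby: callable returning True if the active NameNode
        moved to Standby cleanly.
    fencing_methods: list of (name, callable) pairs, tried in the given order.
    activate_target: callable that transitions the second NameNode to Active.
    Returns a label saying how the old active NameNode was stopped.
    """
    if graceful_to_standby():
        how = "graceful"
    else:
        for name, fence in fencing_methods:
            if fence():
                how = name
                break
        else:
            # Assumption: with no successful fencing, the target is never activated.
            raise RuntimeError("no fencing method succeeded; failover aborted")
    activate_target()
    return how
```

The point of the sketch is that the second NameNode is activated only after the first one has either stepped down cleanly or been fenced.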

It also reduces the long waits between killing ZKFC, failover kicking in after a timeout, and the standby NameNode becoming active.

Attachments

- AMBARI-12488.patch, 21/Jul/15 23:10, 10 kB, Alejandro Fernandez
- AMBARI-12488.v1.patch, 23/Jul/15 01:51, 0.6 kB, Alejandro Fernandez
- AMBARI-12488.v2.patch, 23/Jul/15 01:49, 0.6 kB, Alejandro Fernandez

Issue Links

links to

Code Review patch

People

Assignee: Alejandro Fernandez

Reporter: Alejandro Fernandez

Votes: 0

Watchers: 3

Dates

Created: 21/Jul/15 23:01

Updated: 04/Oct/19 16:12

Resolved: 23/Jul/15 22:41