Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
2.3.0
None
None
Description
The master schedules an SCP for the region server hosting meta. However, due to a misconfiguration, the cluster cannot make progress. After fixing the configuration issue and restarting, the cluster still cannot make progress. After the configured period (15 minutes), the master enters a "holding pattern" in which it retains Active master status but takes no action.
This "brown-out" state is toxic. It should either keep trying to make progress, or it should abort. Staying up and not doing anything is the wrong thing to do.
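To make the requested behavior concrete, here is a minimal sketch of a "retry until a deadline, then abort" loop. It is not HBase code: `RetryOrAbort`, `retryOrAbort`, `Outcome`, and the timings are hypothetical names and values chosen for illustration, and the `attempt` callback stands in for whatever work the master would retry.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names exist in HBase.
public class RetryOrAbort {
    /** Result of the loop: either progress was made, or the caller should abort. */
    enum Outcome { PROGRESS, ABORT }

    /**
     * Calls attempt repeatedly, pausing between tries, until it succeeds or
     * the deadline passes. It never returns while idle: the only exits are
     * "made progress" and "give up so the process can abort".
     */
    static Outcome retryOrAbort(BooleanSupplier attempt, Duration deadline, Duration pause)
            throws InterruptedException {
        long end = System.nanoTime() + deadline.toNanos();
        do {
            if (attempt.getAsBoolean()) {
                return Outcome.PROGRESS;
            }
            Thread.sleep(pause.toMillis());
        } while (System.nanoTime() - end < 0);
        return Outcome.ABORT;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated recovery: the third attempt succeeds.
        int[] calls = {0};
        Outcome recovered = retryOrAbort(
                () -> ++calls[0] >= 3, Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println(recovered); // PROGRESS

        // Simulated permanent failure: the deadline expires, so the caller aborts.
        Outcome stuck = retryOrAbort(
                () -> false, Duration.ofMillis(50), Duration.ofMillis(10));
        System.out.println(stuck); // ABORT
    }
}
```

The point of the sketch is that the loop has no third outcome: a caller that receives `ABORT` is expected to shut the process down rather than remain up while doing nothing.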