When restarting a component that has complicated upgrade logic, the upgrade parameters are not sent to the agents. This can cause some components to fail when they are restarted during a suspended upgrade.
- Begin express upgrade from 18.104.22.168-3796 to 22.214.171.124-37
- HIVE_METASTORE couldn't start b/c of a missing Kerberos property:
- Chose Ignore and Proceed, which means that none of the Metastore SQL files ran.
- Paused the upgrade (presumably at Finalize) and tried to start Metastore. It fails to start because the new HDP 2.5 bits are running against a non-upgraded database. That causes the -info option to fail and makes Ambari think it needs to run -initSchema.
RCA: Metastore failed to start during the upgrade and the admin chose to skip it, so the schema upgrade logic never ran. Ambari can examine the upgrade_suspended property to determine whether upgrade commands need to run when Metastore is restarted during an upgrade.
However, it might be more prudent to simply send along the suspended upgrade properties so that any actions which might need to happen (such as invoking an upgrade script during the restart) can happen when the upgrade is suspended.
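A rough sketch of the proposal, for illustration only and not Ambari's actual code. The function names, the dictionary layout, and the `upgrade_direction` and `upgrade_target_version` keys are hypothetical; only the `upgrade_suspended` property and the schematool options `-info` and `-initSchema` come from the report above. `-upgradeSchema` is the schematool option that upgrades an existing schema.

```python
def build_restart_params(base_params, upgrade):
    """Server side (hypothetical): attach upgrade properties to a restart command.

    `upgrade` is None when no upgrade exists; otherwise it is a dict
    describing the upgrade that is currently suspended or in progress.
    """
    params = dict(base_params)  # copy so the caller's dict is not mutated
    if upgrade is not None and upgrade.get("suspended"):
        # Assumed key names; the real property set may differ.
        params["upgrade_suspended"] = "true"
        params["upgrade_direction"] = upgrade["direction"]
        params["upgrade_target_version"] = upgrade["target_version"]
    return params


def choose_schema_action(info_check_passed, command_params):
    """Agent side (hypothetical): pick the schematool option for a restart.

    Returns None when the schema already validates, "-upgradeSchema" when an
    upgrade is suspended, and "-initSchema" otherwise.
    """
    if info_check_passed:
        return None
    suspended = str(command_params.get("upgrade_suspended", "false")).lower() == "true"
    # A failing -info check during a suspended upgrade most likely means the
    # schema is old rather than missing, so upgrade it instead of re-creating it.
    return "-upgradeSchema" if suspended else "-initSchema"


if __name__ == "__main__":
    upgrade = {"suspended": True, "direction": "upgrade", "target_version": "X.Y"}
    params = build_restart_params({"component": "HIVE_METASTORE"}, upgrade)
    print(choose_schema_action(False, params))
```

With the properties attached, the restart script can tell an un-upgraded database apart from an uninitialized one; without them, a failing -info check still falls through to -initSchema, which is the behavior reported here.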