Good point. StandbyException is perhaps a poor name, or should perhaps be a subclass of UnsupportedActionException. Should we perhaps move UnsupportedActionException to some common package? It's certainly not HA- or even HDFS-specific.
It makes sense to move UnsupportedActionException to a common package.
The layer which performs client failover should be able to determine whether or not an IPC failed because the call was made to the wrong node in a set of HA servers which provide a service. That is why I introduced StandbyException at the IPC layer.
My concern with the common change is adding a notion of StandbyException to IPC. We could have a separate discussion about whether the notion of StandbyException belongs to IPC layer or to the service. One thing I have been thinking is for upper layers to install error/exception handlers. That upper layers could use exception handler to handle StandbyException.
Right, but I'm under the impression that exitState and enterState are the interface methods which should be override by the state subclasses to perform the steps of state transition. The implementations you've provided for standbyToActive and activeToStandby are either just no-ops or aliases for setState(some state), and therefore seem unnecessary.
exitState performs the cleanup required when moving out of a state and enterState performs initialization required for entering state. setState() which is a final method facilitates for all state transitions, the necessary step. I see that this has caused confusion. I also think that having a method to transition to every state, means too many methods in HAState. So I have changed the patch to use setState() and setStateInternal(). See if it makes more sense now.
ActiveState.checkOperation doesn't actually check if the op is of type JOURNAL, although its comment claims to allow all operations other than JOURNAL.
That is because currently NamenodeProtocols does not extend JournalProtocol. When BackupNode code merges with Namenode, ActiveState rejecting JOURNAL operations will become necessary.
I have added checkOperation checks for read operations. There are some operations which are administrative. I have TODO:HA as place holders for now, as this requires more thinking on how should this be allowed. Will open a jira to address the same.