Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ha, namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently namenode supports active, secondary and backup roles. To support namenode high availability, active and standby states are needed. Note that this is different from the existing notion of namenode role, where a namenode cannot transition from one role to the other.

      1. HDFS-1974.1.patch
        29 kB
        Suresh Srinivas
      2. HDFS-1974.2.patch
        27 kB
        Suresh Srinivas
      3. HDFS-1974.3.patch
        25 kB
        Suresh Srinivas
      4. HDFS-1974.4.patch
        28 kB
        Suresh Srinivas
      5. HDFS-1974.5.patch
        32 kB
        Suresh Srinivas
      6. HDFS-1974.6.patch
        32 kB
        Suresh Srinivas
      7. HDFS-1974.patch
        27 kB
        Suresh Srinivas

        Issue Links

          Activity

          Hide
          shv Konstantin Shvachko added a comment -

          Namenode does not have "secondary" role.

          > Note that this is different from the existing notion of namenode role

          It is the same notion to me. NameNode in current implementation cannot transition from one role to another. But in HA world it should.

          Show
          shv Konstantin Shvachko added a comment - Namenode does not have "secondary" role. > Note that this is different from the existing notion of namenode role It is the same notion to me. NameNode in current implementation cannot transition from one role to another. But in HA world it should.
          Hide
          sureshms Suresh Srinivas added a comment -

          Early version of the patch, introduces state machines for active/standby.

          I will update the design document in HDFS-1623 with details and to this jira, once the implementation is done.

          Show
          sureshms Suresh Srinivas added a comment - Early version of the patch, introduces state machines for active/standby. I will update the design document in HDFS-1623 with details and to this jira, once the implementation is done.
          Hide
          sureshms Suresh Srinivas added a comment -

          > It is the same notion to me. NameNode in current implementation cannot transition from one role to another. But in HA world it should.
          I think we are discussing this in HDFS-2141. Lets continue there.

          Show
          sureshms Suresh Srinivas added a comment - > It is the same notion to me. NameNode in current implementation cannot transition from one role to another. But in HA world it should. I think we are discussing this in HDFS-2141 . Lets continue there.
          Hide
          tlipcon Todd Lipcon added a comment -

          Taking a quick look at the patch. A few thoughts:

          • rather than adding more things to NN, would it be possible to get the state management in a separate class? NameNode.java already has a lot of "stuff" in it, which can make it hard to mock, etc.
          • I see ACTIVE_STATE and STANDBY_STATE are static variables. Does that imply that these State classes will have no member variables? If they do have member variables, it will make testing very difficult, since we need multiple NNs in a Minicluster
          • I've been thinking about NameNode.java a bit with regard to HA: I think it might make this work a little cleaner if we were to do a bit of refactoring first. I'm imagining it would be broken up so that NameNode itself becomes just a container class, which contains NameNodeClientService (implements ClientProtocol), NameNodeDatanodeService (implements DatanodeProtocol), NameNodeHttpService (starts web server), etc. Then NameNode itself would do state management and transitions, and move the actual RPC handling code out to other spots. What do you think?
          Show
          tlipcon Todd Lipcon added a comment - Taking a quick look at the patch. A few thoughts: rather than adding more things to NN, would it be possible to get the state management in a separate class? NameNode.java already has a lot of "stuff" in it, which can make it hard to mock, etc. I see ACTIVE_STATE and STANDBY_STATE are static variables. Does that imply that these State classes will have no member variables? If they do have member variables, it will make testing very difficult, since we need multiple NNs in a Minicluster I've been thinking about NameNode.java a bit with regard to HA: I think it might make this work a little cleaner if we were to do a bit of refactoring first. I'm imagining it would be broken up so that NameNode itself becomes just a container class, which contains NameNodeClientService (implements ClientProtocol), NameNodeDatanodeService (implements DatanodeProtocol), NameNodeHttpService (starts web server), etc. Then NameNode itself would do state management and transitions, and move the actual RPC handling code out to other spots. What do you think?
          Hide
          sureshms Suresh Srinivas added a comment -

          rather than adding more things to NN, would it be possible to get the state management in a separate class? NameNode.java already has a lot of "stuff" in it, which can make it hard to mock, etc.

          I am open to making the states a separate class. The reason I am making it inner class is to allow states access to NN internals. But we could expose only the NN interface needed to states.

          I see ACTIVE_STATE and STANDBY_STATE are static variables. Does that imply that these State classes will have no member variables? If they do have member variables, it will make testing very difficult, since we need multiple NNs in a Minicluster.

          This should not be a problem. NN is passed as parameter to these states and in multi cluster env, different NN references are passed. BTW this is based on state machine pattern - with NN as the context.

          NameNodeClientService (implements ClientProtocol), NameNodeDatanodeService (implements DatanodeProtocol), NameNodeHttpService (starts web server), etc.

          I think it is a good idea. I have been thinking about this as well. Currently, I am looking at the all the internal NN services that needs to be started when NN enters/exits active and standby states (see TODO:HA in the current patch). Sanjay has been thinking about this as well for some protocol compatibility cleanup and he might be on his way to doing this separation.

          Show
          sureshms Suresh Srinivas added a comment - rather than adding more things to NN, would it be possible to get the state management in a separate class? NameNode.java already has a lot of "stuff" in it, which can make it hard to mock, etc. I am open to making the states a separate class. The reason I am making it inner class is to allow states access to NN internals. But we could expose only the NN interface needed to states. I see ACTIVE_STATE and STANDBY_STATE are static variables. Does that imply that these State classes will have no member variables? If they do have member variables, it will make testing very difficult, since we need multiple NNs in a Minicluster. This should not be a problem. NN is passed as parameter to these states and in multi cluster env, different NN references are passed. BTW this is based on state machine pattern - with NN as the context. NameNodeClientService (implements ClientProtocol), NameNodeDatanodeService (implements DatanodeProtocol), NameNodeHttpService (starts web server), etc. I think it is a good idea. I have been thinking about this as well. Currently, I am looking at the all the internal NN services that needs to be started when NN enters/exits active and standby states (see TODO:HA in the current patch). Sanjay has been thinking about this as well for some protocol compatibility cleanup and he might be on his way to doing this separation.
          Hide
          sureshms Suresh Srinivas added a comment -

          NameNodeClientService (implements ClientProtocol), NameNodeDatanodeService (implements DatanodeProtocol), NameNodeHttpService (starts web server), etc.

          Forgot to mention this earlier, other issue with this approach doing the following when starting single RPC server is not easy:

          this.server = RPC.getServer(NamenodeProtocols.class, this,
                                          socAddr.getHostName(), socAddr.getPort(),
                                          handlerCount, false, conf, 
                                          namesystem.getDelegationTokenSecretManager());
          
          Show
          sureshms Suresh Srinivas added a comment - NameNodeClientService (implements ClientProtocol), NameNodeDatanodeService (implements DatanodeProtocol), NameNodeHttpService (starts web server), etc. Forgot to mention this earlier, other issue with this approach doing the following when starting single RPC server is not easy: this.server = RPC.getServer(NamenodeProtocols.class, this, socAddr.getHostName(), socAddr.getPort(), handlerCount, false, conf, namesystem.getDelegationTokenSecretManager());
          Hide
          sureshms Suresh Srinivas added a comment -

          Updated patch with States moved to separate class.

          Show
          sureshms Suresh Srinivas added a comment - Updated patch with States moved to separate class.
          Hide
          shv Konstantin Shvachko added a comment -

          > I will update the design document ... once the implementation is done.

          Shouldn't it be the other way around?

          Show
          shv Konstantin Shvachko added a comment - > I will update the design document ... once the implementation is done. Shouldn't it be the other way around?
          Hide
          sureshms Suresh Srinivas added a comment -

          Well, to see how this works in the code, I had earlier implementation. I posted it to get initial feedback. I agree with you, will post the design document soon.

          Show
          sureshms Suresh Srinivas added a comment - Well, to see how this works in the code, I had earlier implementation. I posted it to get initial feedback. I agree with you, will post the design document soon.
          Hide
          sureshms Suresh Srinivas added a comment -

          Here is design notes.

          Namenode will have two states related to high availability:

          1. Active - In this state namenode supports all the services related to namespace and operations related to it.
          2. Standby - In this state, namenode acts as warm standby and will have updated namespace and block locations map.

          State machine implementation:

          State machine is implemented using state pattern - http://en.wikipedia.org/wiki/State_pattern.

          The base state has the following main methods:
          noformat
          abstract class State {
          /** Handler to transition from standby to active state */
          protected abstract void standbyToActive(NameNode nn)

          /** Handler to transition from active to standby state */
          protected abstract void activeToStandby(NameNode nn)
          throws ServiceFailedException;

          /** Method to be overridden by subclasses to perform steps start services
          required while entering a state. */
          protected void enterState(NameNode nn) throws ServiceFailedException

          { // Nothing to do }

          /** Method to be overridden by subclasses to cleanup/stop services while
          exiting a state. */
          protected void exitState(NameNode nn) throws ServiceFailedException { // Nothing to do }

          }
          noformat

          Key points:

          1. State does not have any members of its own. It only implements the logic necessary with in a state. Namenode is the context for the state, using which state performs necessary function.
          2. Active and Standby extend this state.
          3. Namenode will have singletons for ActiveState and StandbyState.
            • Namenode#state points to either ActiveState or StandbyState.
            • There is no need to write if (state is Active) then do. Such decision is deleagated to the state it self. The concrete state will provide necessary specialization.
            • Some examples such specialization: checking if a certain read/write/checkpoint operation is supported in a state. For example ActiveState will allow Read/Write operations. StandbyState currently does not support any of these operations. We could enhance StandbyState to support read operations in the future.

          Open items that will be addressed in a separate jira:

          1. What services should start when entering ActiveState and StandbyState. What services should stop when exiting ActiveState and StandbyState.
          2. Should StandbyState allow read operations.

          Please see the initial code. Will udpate HDFS-1623 with the design information.

          Show
          sureshms Suresh Srinivas added a comment - Here is design notes. Namenode will have two states related to high availability: Active - In this state namenode supports all the services related to namespace and operations related to it. Standby - In this state, namenode acts as warm standby and will have updated namespace and block locations map. State machine implementation: State machine is implemented using state pattern - http://en.wikipedia.org/wiki/State_pattern . The base state has the following main methods: noformat abstract class State { /** Handler to transition from standby to active state */ protected abstract void standbyToActive(NameNode nn) /** Handler to transition from active to standby state */ protected abstract void activeToStandby(NameNode nn) throws ServiceFailedException; /** Method to be overridden by subclasses to perform steps start services required while entering a state. */ protected void enterState(NameNode nn) throws ServiceFailedException { // Nothing to do } /** Method to be overridden by subclasses to cleanup/stop services while exiting a state. */ protected void exitState(NameNode nn) throws ServiceFailedException { // Nothing to do } } noformat Key points: State does not have any members of its own. It only implements the logic necessary with in a state. Namenode is the context for the state, using which state performs necessary function. Active and Standby extend this state. Namenode will have singletons for ActiveState and StandbyState. Namenode#state points to either ActiveState or StandbyState. There is no need to write if (state is Active) then do. Such decision is deleagated to the state it self. The concrete state will provide necessary specialization. Some examples such specialization: checking if a certain read/write/checkpoint operation is supported in a state. For example ActiveState will allow Read/Write operations. StandbyState currently does not support any of these operations. We could enhance StandbyState to support read operations in the future. Open items that will be addressed in a separate jira: What services should start when entering ActiveState and StandbyState. What services should stop when exiting ActiveState and StandbyState. Should StandbyState allow read operations. Please see the initial code. Will udpate HDFS-1623 with the design information.
          Hide
          tlipcon Todd Lipcon added a comment -

          I opened HDFS-2180 to refactor out the HTTP server into its own class. I figure we can tackle the refactoring bit-by-bit to make it easy to review.

          Show
          tlipcon Todd Lipcon added a comment - I opened HDFS-2180 to refactor out the HTTP server into its own class. I figure we can tackle the refactoring bit-by-bit to make it easy to review.
          Hide
          tlipcon Todd Lipcon added a comment -

          I opened HDFS-2197 to refactor the RPC implementations to a new class

          Show
          tlipcon Todd Lipcon added a comment - I opened HDFS-2197 to refactor the RPC implementations to a new class
          Hide
          sureshms Suresh Srinivas added a comment -

          Todd, can you please run test-patch before committing the changes.

          Show
          sureshms Suresh Srinivas added a comment - Todd, can you please run test-patch before committing the changes.
          Hide
          sureshms Suresh Srinivas added a comment -

          Wrong jira... Ignore my previous comment.

          Show
          sureshms Suresh Srinivas added a comment - Wrong jira... Ignore my previous comment.
          Hide
          atm Aaron T. Myers added a comment -

          Thanks for posting this design doc, Suresh. Some comments on the latest patch:

          1. public static boolean isHAEnabled(Configuration conf) - is the intention that this will return true if HA is enabled on the cluster? Or that this will return true only on the currently-active NN? The answer to this question may invalidate the next question.
          2. this.state = DFSUtil.isHAEnabled(conf) ? ACTIVE_STATE : STANDBY_STATE; - should probably be replaced with a TODO. It's obviously not reasonable to have both NNs start as active just because HA is enabled. There needs to be some leader determination.
          3. In HAState, why not make enterState and exitState abstract?
          4. I find it unfortunate/brittle that every FS operation call in NameNode now has to include the method name as a string, particularly since this is only used for logging. Is the stack trace, which will include the method name, not sufficient?
          5. UnsupportedActionException seems redundant with o.a.h.ipc.StandbyException.
          6. In NameNode.OperationCategory, the comments are identical for CHECKPOINT and JOURNAL operations. For that matter, I'm not sure which operations are intended to fall under JOURNAL.
          7. StandbyState.checkOperation for the time being, it doesn't make sense that we should be serving reads from the standby. We don't yet have any guarantees about the freshness of the standby's date. Let's leave that for a later optimization.
          8. I don't understand the purpose of having both HAState.setState and HAState.(standbyToActive|activeToStandby). Shouldn't just setState be sufficient?
          9. In HAState.setState, you have exitState(nn); s.enterState(nn);. What state should the NN be considered to be in between these calls?
          10. You never reassign the value of NameNode.state.
          11. There's no synchronization in this patch. Many or most of these operations will need some locking.
          12. It seems like all of the new classes should go in the o.a.h.hdfs.server.namenode.ha package, rather than just o.a.h.hdfs.server.namenode.
          Show
          atm Aaron T. Myers added a comment - Thanks for posting this design doc, Suresh. Some comments on the latest patch: public static boolean isHAEnabled(Configuration conf) - is the intention that this will return true if HA is enabled on the cluster? Or that this will return true only on the currently-active NN? The answer to this question may invalidate the next question. this.state = DFSUtil.isHAEnabled(conf) ? ACTIVE_STATE : STANDBY_STATE; - should probably be replaced with a TODO. It's obviously not reasonable to have both NNs start as active just because HA is enabled. There needs to be some leader determination. In HAState , why not make enterState and exitState abstract? I find it unfortunate/brittle that every FS operation call in NameNode now has to include the method name as a string, particularly since this is only used for logging. Is the stack trace, which will include the method name, not sufficient? UnsupportedActionException seems redundant with o.a.h.ipc.StandbyException . In NameNode.OperationCategory , the comments are identical for CHECKPOINT and JOURNAL operations. For that matter, I'm not sure which operations are intended to fall under JOURNAL . StandbyState.checkOperation for the time being, it doesn't make sense that we should be serving reads from the standby. We don't yet have any guarantees about the freshness of the standby's date. Let's leave that for a later optimization. I don't understand the purpose of having both HAState.setState and HAState.(standbyToActive|activeToStandby) . Shouldn't just setState be sufficient? In HAState.setState , you have exitState(nn); s.enterState(nn); . What state should the NN be considered to be in between these calls? You never reassign the value of NameNode.state . There's no synchronization in this patch. Many or most of these operations will need some locking. It seems like all of the new classes should go in the o.a.h.hdfs.server.namenode.ha package, rather than just o.a.h.hdfs.server.namenode .
          Hide
          sureshms Suresh Srinivas added a comment -

          Thanks for the comments. I have addressed most of them in the new patch.

          public static boolean isHAEnabled(Configuration conf) - is the intention that this will return true if HA is enabled on the cluster? yes

          this.state = DFSUtil.isHAEnabled(conf) ? ACTIVE_STATE : STANDBY_STATE; - should probably be replaced with a TODO. It's obviously not reasonable to have both NNs start as active just because HA is enabled. There needs to be some leader determination.

          I meant !isHAEnabled, then retain the older functionality. New patch addresses it.

          UnsupportedActionException seems redundant with o.a.h.ipc.StandbyException.

          How do you handle operations related to checkpointing and journaling that namenode in it's various states/roles cannot satisfy?

          Further we need to think if StandbyException should be at IPC layer? Is this not Service related exception?

          I don't understand the purpose of having both HAState.setState and HAState.(standbyToActive|activeToStandby). Shouldn't just setState be sufficient?

          Setting a state involves multiple steps - that existing current state and entering a new state. standbyToActive and activeToStandby are interface methods.

          In HAState.setState, you have exitState(nn); s.enterState(nn);. What state should the NN be considered to be in between these calls?

          exitState does the cleanup required for exiting a state, with in the state. I have made a change to set the state of NN before enterState.

          Show
          sureshms Suresh Srinivas added a comment - Thanks for the comments. I have addressed most of them in the new patch. public static boolean isHAEnabled(Configuration conf) - is the intention that this will return true if HA is enabled on the cluster? yes this.state = DFSUtil.isHAEnabled(conf) ? ACTIVE_STATE : STANDBY_STATE; - should probably be replaced with a TODO. It's obviously not reasonable to have both NNs start as active just because HA is enabled. There needs to be some leader determination. I meant !isHAEnabled, then retain the older functionality. New patch addresses it. UnsupportedActionException seems redundant with o.a.h.ipc.StandbyException. How do you handle operations related to checkpointing and journaling that namenode in it's various states/roles cannot satisfy? Further we need to think if StandbyException should be at IPC layer? Is this not Service related exception? I don't understand the purpose of having both HAState.setState and HAState.(standbyToActive|activeToStandby). Shouldn't just setState be sufficient? Setting a state involves multiple steps - that existing current state and entering a new state. standbyToActive and activeToStandby are interface methods. In HAState.setState, you have exitState(nn); s.enterState(nn);. What state should the NN be considered to be in between these calls? exitState does the cleanup required for exiting a state, with in the state. I have made a change to set the state of NN before enterState.
          Hide
          sureshms Suresh Srinivas added a comment -

          Changes:

          1. Added HA enabled checks in to some of the methods
          2. Added Journal OperationCategory type and it is allowed only in BackupNode.
          3. Removed methods in BackupNode in lieu of checks already made in NameNode.
          Show
          sureshms Suresh Srinivas added a comment - Changes: Added HA enabled checks in to some of the methods Added Journal OperationCategory type and it is allowed only in BackupNode. Removed methods in BackupNode in lieu of checks already made in NameNode.
          Hide
          atm Aaron T. Myers added a comment -

          Thanks for addressing my comments, Suresh. Comments inline.

          How do you handle operations related to checkpointing and journaling that namenode in it's various states/roles cannot satisfy?

          Good point. StandbyException is perhaps a poor name, or should perhaps be a subclass of UnsupportedActionException. Should we perhaps move UnsupportedActionException to some common package? It's certainly not HA- or even HDFS-specific.

          Further we need to think if StandbyException should be at IPC layer? Is this not Service related exception?

          The layer which performs client failover should be able to determine whether or not an IPC failed because the call was made to the wrong node in a set of HA servers which provide a service. That is why I introduced StandbyException at the IPC layer.

          Setting a state involves multiple steps - that existing current state and entering a new state. standbyToActive and activeToStandby are interface methods.

          Right, but I'm under the impression that exitState and enterState are the interface methods which should be override by the state subclasses to perform the steps of state transition. The implementations you've provided for standbyToActive and activeToStandby are either just no-ops or aliases for setState(some state), and therefore seem unnecessary.

          exitState does the cleanup required for exiting a state, with in the state. I have made a change to set the state of NN before enterState.

          I don't see where this change was made in the latest patch.

          A few more comments:

          1. Why aren't you adding checkOperation calls in all the read operations of the NN?
          2. ActiveState.checkOperation doesn't actually check if the op is of type JOURNAL, although its comment claims to allow all operations other than JOURNAL.
          Show
          atm Aaron T. Myers added a comment - Thanks for addressing my comments, Suresh. Comments inline. How do you handle operations related to checkpointing and journaling that namenode in it's various states/roles cannot satisfy? Good point. StandbyException is perhaps a poor name, or should perhaps be a subclass of UnsupportedActionException. Should we perhaps move UnsupportedActionException to some common package? It's certainly not HA- or even HDFS-specific. Further we need to think if StandbyException should be at IPC layer? Is this not Service related exception? The layer which performs client failover should be able to determine whether or not an IPC failed because the call was made to the wrong node in a set of HA servers which provide a service. That is why I introduced StandbyException at the IPC layer. Setting a state involves multiple steps - that existing current state and entering a new state. standbyToActive and activeToStandby are interface methods. Right, but I'm under the impression that exitState and enterState are the interface methods which should be override by the state subclasses to perform the steps of state transition. The implementations you've provided for standbyToActive and activeToStandby are either just no-ops or aliases for setState(some state) , and therefore seem unnecessary. exitState does the cleanup required for exiting a state, with in the state. I have made a change to set the state of NN before enterState. I don't see where this change was made in the latest patch. A few more comments: Why aren't you adding checkOperation calls in all the read operations of the NN? ActiveState.checkOperation doesn't actually check if the op is of type JOURNAL , although its comment claims to allow all operations other than JOURNAL .
          Hide
          sureshms Suresh Srinivas added a comment -

          Good point. StandbyException is perhaps a poor name, or should perhaps be a subclass of UnsupportedActionException. Should we perhaps move UnsupportedActionException to some common package? It's certainly not HA- or even HDFS-specific.

          It makes sense to move UnsupportedActionException to a common package.

          The layer which performs client failover should be able to determine whether or not an IPC failed because the call was made to the wrong node in a set of HA servers which provide a service. That is why I introduced StandbyException at the IPC layer.

          My concern with the common change is adding a notion of StandbyException to IPC. We could have a separate discussion about whether the notion of StandbyException belongs to IPC layer or to the service. One thing I have been thinking is for upper layers to install error/exception handlers. That upper layers could use exception handler to handle StandbyException.

          Right, but I'm under the impression that exitState and enterState are the interface methods which should be override by the state subclasses to perform the steps of state transition. The implementations you've provided for standbyToActive and activeToStandby are either just no-ops or aliases for setState(some state), and therefore seem unnecessary.

          exitState performs the cleanup required when moving out of a state and enterState performs initialization required for entering state. setState() which is a final method facilitates for all state transitions, the necessary step. I see that this has caused confusion. I also think that having a method to transition to every state, means too many methods in HAState. So I have changed the patch to use setState() and setStateInternal(). See if it makes more sense now.

          ActiveState.checkOperation doesn't actually check if the op is of type JOURNAL, although its comment claims to allow all operations other than JOURNAL.

          That is because currently NamenodeProtocols does not extend JournalProtocol. When BackupNode code merges with Namenode, ActiveState rejecting JOURNAL operations will become necessary.

          I have added checkOperation checks for read operations. There are some operations which are administrative. I have TODO:HA as place holders for now, as this requires more thinking on how should this be allowed. Will open a jira to address the same.

          Show
          sureshms Suresh Srinivas added a comment - Good point. StandbyException is perhaps a poor name, or should perhaps be a subclass of UnsupportedActionException. Should we perhaps move UnsupportedActionException to some common package? It's certainly not HA- or even HDFS-specific. It makes sense to move UnsupportedActionException to a common package. The layer which performs client failover should be able to determine whether or not an IPC failed because the call was made to the wrong node in a set of HA servers which provide a service. That is why I introduced StandbyException at the IPC layer. My concern with the common change is adding a notion of StandbyException to IPC. We could have a separate discussion about whether the notion of StandbyException belongs to IPC layer or to the service. One thing I have been thinking is for upper layers to install error/exception handlers. That upper layers could use exception handler to handle StandbyException. Right, but I'm under the impression that exitState and enterState are the interface methods which should be override by the state subclasses to perform the steps of state transition. The implementations you've provided for standbyToActive and activeToStandby are either just no-ops or aliases for setState(some state), and therefore seem unnecessary. exitState performs the cleanup required when moving out of a state and enterState performs initialization required for entering state. setState() which is a final method facilitates for all state transitions, the necessary step. I see that this has caused confusion. I also think that having a method to transition to every state, means too many methods in HAState. So I have changed the patch to use setState() and setStateInternal(). See if it makes more sense now. ActiveState.checkOperation doesn't actually check if the op is of type JOURNAL, although its comment claims to allow all operations other than JOURNAL. That is because currently NamenodeProtocols does not extend JournalProtocol. When BackupNode code merges with Namenode, ActiveState rejecting JOURNAL operations will become necessary. I have added checkOperation checks for read operations. There are some operations which are administrative. I have TODO:HA as place holders for now, as this requires more thinking on how should this be allowed. Will open a jira to address the same.
          Hide
          sureshms Suresh Srinivas added a comment -

          Minor updates to patch.

          Show
          sureshms Suresh Srinivas added a comment - Minor updates to patch.
          Hide
          tlipcon Todd Lipcon added a comment -

          That is because currently NamenodeProtocols does not extend JournalProtocol. When BackupNode code merges with Namenode, ActiveState rejecting JOURNAL operations will become necessary.

          Using Sanjay's new support for mix-in RPC implementations, we can continue to keep JournalProtocol separate, and only "mix it in" when starting up as the BN.

          Show
          tlipcon Todd Lipcon added a comment - That is because currently NamenodeProtocols does not extend JournalProtocol. When BackupNode code merges with Namenode, ActiveState rejecting JOURNAL operations will become necessary. Using Sanjay's new support for mix-in RPC implementations, we can continue to keep JournalProtocol separate, and only "mix it in" when starting up as the BN.
          Hide
          sureshms Suresh Srinivas added a comment -

          Using Sanjay's new support for mix-in RPC implementations, we can continue to keep JournalProtocol separate, and only "mix it in" when starting up as the BN.

          Good idea. But you still need your active state to make the change, that Aaron suggested.

          Show
          sureshms Suresh Srinivas added a comment - Using Sanjay's new support for mix-in RPC implementations, we can continue to keep JournalProtocol separate, and only "mix it in" when starting up as the BN. Good idea. But you still need your active state to make the change, that Aaron suggested.
          Hide
          atm Aaron T. Myers added a comment -

          Thanks a lot for addressing my concerns, Suresh. The latest patch looks a lot better to me.

          My concern with the common change is adding a notion of StandbyException to IPC. We could have a separate discussion about whether the notion of StandbyException belongs to IPC layer or to the service. One thing I have been thinking is for upper layers to install error/exception handlers. That upper layers could use exception handler to handle StandbyException.

          My point isn't that StandbyException is necessarily the right approach, just that I do think the IPC layer should be responsible for the mechanics of client failover, and thus that there should be some way for an exception thrown by a remote service to indicate to the client that a request should be retried against another server. But, this discussion should probably happen on a separate JIRA.

          I also think that having a method to transition to every state, means too many methods in HAState. So I have changed the patch to use setState() and setStateInternal(). See if it makes more sense now.

          I agree. The latest patch looks much better to me in this regard.

          Two final comments. +1 once these are addressed:

          1. The two-argument constructor in UnsupportedActionException doesn't actually use the action parameter.
          2. I don't think listCorruptFileBlocks is a write operation.
          Show
          atm Aaron T. Myers added a comment - Thanks a lot for addressing my concerns, Suresh. The latest patch looks a lot better to me. My concern with the common change is adding a notion of StandbyException to IPC. We could have a separate discussion about whether the notion of StandbyException belongs to IPC layer or to the service. One thing I have been thinking is for upper layers to install error/exception handlers. That upper layers could use exception handler to handle StandbyException. My point isn't that StandbyException is necessarily the right approach, just that I do think the IPC layer should be responsible for the mechanics of client failover, and thus that there should be some way for an exception thrown by a remote service to indicate to the client that a request should be retried against another server. But, this discussion should probably happen on a separate JIRA. I also think that having a method to transition to every state, means too many methods in HAState. So I have changed the patch to use setState() and setStateInternal(). See if it makes more sense now. I agree. The latest patch looks much better to me in this regard. Two final comments. +1 once these are addressed: The two-argument constructor in UnsupportedActionException doesn't actually use the action parameter. I don't think listCorruptFileBlocks is a write operation.
          Hide
          sureshms Suresh Srinivas added a comment -

          New patch that addresses the comments from Aaron. Here is the test-patch result:
          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no new tests are needed for this patch.
          [exec] Also please list what manual steps were performed to verify this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
          [exec]
          [exec] -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).
          [exec]
          [exec] +1 system test framework. The patch passed system test framework compile.
          [exec]

          Release audit comes from missing banner in CHANGES.HDFS-1623.txt. Will fix that in a different commit.

          Show
          sureshms Suresh Srinivas added a comment - New patch that addresses the comments from Aaron. Here is the test-patch result: [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] -1 release audit. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings). [exec] [exec] +1 system test framework. The patch passed system test framework compile. [exec] Release audit comes from missing banner in CHANGES. HDFS-1623 .txt. Will fix that in a different commit.
          Hide
          sureshms Suresh Srinivas added a comment -

          I committed the patch to HDFS-1623 branch.

          Show
          sureshms Suresh Srinivas added a comment - I committed the patch to HDFS-1623 branch.

            People

            • Assignee:
              sureshms Suresh Srinivas
              Reporter:
              sureshms Suresh Srinivas
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development