Hadoop Common / HADOOP-7472

RPC client should deal with the IP address changes

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 0.20.205.0, 0.23.0
    • Component/s: ipc
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The current RPC client implementation and the client-side callers assume that the hostname-address mappings of servers never change. The resolved address is stored in an immutable InetSocketAddress object above/outside RPC, and the reconnect logic in the RPC Connection implementation also trusts the resolved address that was passed down.

      If the NN suffers a failure that requires migration, it may be restarted on a different node with a different IP address. In this case, even if the name-address mapping is updated in DNS, the cluster is stuck trying the old address until the whole cluster is restarted.

      The RPC client-side should detect this situation and exit or try to recover.

      Updating ConnectionId within the Client implementation may get the system working for the moment, but there is always a risk of the cached address:port unintentionally becoming connectable again. The real solution is to notify the upper layers of the address change so that they can re-resolve and retry, or to re-architect the system as discussed in HDFS-34.

      For the 0.20 lines, some type of compromise may be acceptable. For example, raise a custom exception for a few well-defined, high-impact upper layers to re-resolve and retry, while others will have to restart. For TRUNK, the HA work will most likely determine what needs to be done, so this Jira won't cover solutions for TRUNK.
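The caching problem described above can be seen directly in java.net.InetSocketAddress (a minimal sketch; the host and port are illustrative):

```java
import java.net.InetSocketAddress;

class ResolvedOnceDemo {
    public static void main(String[] args) {
        // The two-argument constructor resolves the hostname eagerly, and the
        // object is immutable: getAddress() returns the same InetAddress for
        // the lifetime of the object, even if the DNS mapping later changes.
        InetSocketAddress resolved = new InetSocketAddress("localhost", 8020);
        System.out.println(resolved.isUnresolved());  // false: resolved at construction

        // createUnresolved() skips the lookup entirely, so only the hostname
        // is retained and resolution can be repeated later.
        InetSocketAddress lazy = InetSocketAddress.createUnresolved("localhost", 8020);
        System.out.println(lazy.isUnresolved());      // true
    }
}
```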

      1. addr_change_dfs_trunk-3.patch.txt
        2 kB
        Kihwal Lee
      2. addr_change_dfs_0_20s-2.patch.txt
        2 kB
        Kihwal Lee
      3. addr_change_dfs_trunk-2.patch.txt
        2 kB
        Kihwal Lee
      4. addr_change_dfs_0_20s-1.patch.txt
        2 kB
        Kihwal Lee
      5. addr_change_dfs_trunk-1.patch.txt
        3 kB
        Kihwal Lee
      6. addr_change_dfs_0_20s.patch.txt
        2 kB
        Kihwal Lee
      7. addr_change_dfs_trunk.patch.txt
        3 kB
        Kihwal Lee
      8. addr_change_dfs-3.patch.txt
        2 kB
        Kihwal Lee
      9. addr_change_dfs-2.patch.txt
        3 kB
        Kihwal Lee
      10. addr_change_dfs-1.patch.txt
        18 kB
        Kihwal Lee
      11. addr_change_dfs.patch.txt
        20 kB
        Kihwal Lee

        Issue Links

          Activity

          Steve Loughran added a comment -

           This is surprisingly hard to do in Java. HDFS-34 does cover the JVM caching problems, and if you are running a caching DNS server it's even harder. It is not sufficient for the RPC layer to redo an nslookup, as changes won't be seen. Instead you need to go one step back: have the URI to the FS refer to something that isn't a hostname (e.g. some ZooKeeper entry), and then on a connection failure the client and DNs need to reread that value and see where they need to talk to.

          Kihwal Lee added a comment -

           Yes, it is a hard problem and I hope it is addressed in the HA work. This JIRA, though, is about improving system behavior during a certain class of failures, specifically for the 0.20 lines, not necessarily about achieving very high cluster availability through automatic failover within the cluster.

           For the failure scenario described in this JIRA, the recovery involves manual steps today. The consensus/fencing is forced manually and physically, and the failure detection latencies in Hadoop can be relatively long (currently infinite in some cases). When the cluster is restarted, the system loses a lot of state, and kindly asking users to resubmit their jobs is not exactly something we are thrilled about doing.

          Here is what I think we can do (for 0.20):

           • Improve the failure detection in software for the scenario described above. If the NN becomes unavailable and comes back on the same node, there will be no false alarms. Easy to do.
          • Improve some components to do a smart recovery when such a failure is detected. For example, it will be nice if this can be done for JT so that it doesn't lose its state.

          I wouldn't call it HA, but it still can improve MTTR and reduce the impact on customers.

          Suresh Srinivas added a comment -

          http://download.oracle.com/javase/6/docs/api/java/net/InetAddress.html
          By default, when a security manager is installed, in order to protect against DNS spoofing attacks, the result of positive host name resolutions are cached forever. When a security manager is not installed, the default behavior is to cache entries for a finite (implementation dependent) period of time. The result of unsuccessful host name resolution is cached for a very short period of time (10 seconds) to improve performance.

           In setups where a security manager is not installed, the test program I wrote did pick up the new IP address for a host within 15 minutes on the Sun JDK. So re-resolving the address is still a viable solution for such setups. I am not sure how other implementations behave.
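The JDK cache behavior quoted above can be tuned via security properties; a small sketch (the TTL values are illustrative, and they must be set before the first lookup to take effect):

```java
import java.security.Security;

class DnsTtlDemo {
    public static void main(String[] args) {
        // JVM-wide DNS cache TTLs, in seconds. "-1" (the default under a
        // security manager) means cache successful lookups forever;
        // "0" disables caching entirely.
        Security.setProperty("networkaddress.cache.ttl", "60");
        Security.setProperty("networkaddress.cache.negative.ttl", "10");
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
        System.out.println(Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```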

          Kihwal Lee added a comment -

           The re-resolving part is fine. But the fact that the callers keep the old address in their immutable InetSocketAddress objects can cause problems. Since we cannot afford to check the mapping on every RPC invocation by re-resolving, the connection will go through if an NN is started with the old address by mistake. Also, the connection cache in the RPC Client uses the address as the key. Do we keep the old key, or update it with the new one? Either way it's not clean as long as the upper layer keeps using the old resolved address.

          Steve Loughran added a comment -

          re-resolution could be part of the handling of connection failures.

          1. When a DN or TT spins waiting for a master to come back up, it should re-resolve the hostname every time.
           2. For other clients, on a socket connect failure (refused, timeout, etc.), a re-resolution could be attempted. I wouldn't hide this down at the bottom though; better to make it visible for the HDFS code to retry.

          I can see that testing this is going to be fun.
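The re-resolution Steve describes might look roughly like this (an illustrative helper, not the actual ipc.Client code):

```java
import java.net.InetSocketAddress;

class ReResolveDemo {
    // On a connection failure, redo the DNS lookup for the same hostname so
    // the caller can compare it against the address it last resolved.
    static InetSocketAddress reResolve(InetSocketAddress cached) {
        return new InetSocketAddress(cached.getHostName(), cached.getPort());
    }

    public static void main(String[] args) {
        InetSocketAddress cached = new InetSocketAddress("localhost", 8020);
        InetSocketAddress fresh = reResolve(cached);
        // With a stable mapping the fresh lookup matches the cached address;
        // after a migration it would differ and the client should reconnect.
        System.out.println(fresh.getAddress().equals(cached.getAddress()));
    }
}
```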

          Kihwal Lee added a comment -

           My current approach is to check for an address change on connection failures and response timeouts. If a change is detected, the callers who are waiting on the server will get an exception. I am focusing on the DFS side for now, since it has the highest impact. Other users of the RPC Client will get an exception, but I am not looking at adding recovery mechanisms there at this time. The important ones are for talking to the NN, and for the others it usually makes more sense to fail early.

          Again, this is not for the service level HA. Callers will see failures and they need to have a retry logic to recover. But they won't be stuck forever anymore.

           I will post a patch soon, once I do more testing. Any suggestions on how it can be tested and what kind of test should be added? I cannot think of any meaningful test that can be added with the patch.

          Kihwal Lee added a comment -

          The attached patch includes changes to DFSClient and ipc.Client. The RPC callers will get InvalidRPCAddressException when the IP address of the name node changes. DFSClient handles this exception so that the RPC client is recreated with the new address. Subsequent calls are directed to the new name node.

          This patch doesn't modify other RPC callers. They will get an exception and some will recover automatically, others won't. The datanode heartbeat seems to recover automatically. I assume the cluster will be restarted with a new config anyway when the name node has to move.

           There is no test included in this patch. I tested it by flipping the system configuration and killing/starting the namenode at the right moment and in the right order. I welcome any suggestion. If no one can come up with a reasonable automated test for our existing testing framework, I will request an exemption after more thorough testing on a real cluster.
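The caller-side shape of the recovery described above might look like this sketch (the Rpc interface and recreate supplier are stand-ins, not Hadoop APIs; only the exception name comes from the patch):

```java
import java.util.function.Supplier;

class RetryOnAddrChangeDemo {
    // Stand-in for the exception the patch adds to ipc.Client.
    static class InvalidRPCAddressException extends Exception {}

    interface Rpc {
        String invoke(String method) throws InvalidRPCAddressException;
    }

    // Invoke once; if the server address changed, rebuild the proxy with a
    // freshly resolved address and retry the call once.
    static String callWithRecovery(Rpc proxy, String method, Supplier<Rpc> recreate)
            throws InvalidRPCAddressException {
        try {
            return proxy.invoke(method);
        } catch (InvalidRPCAddressException e) {
            return recreate.get().invoke(method);
        }
    }

    public static void main(String[] args) throws Exception {
        Rpc stale = m -> { throw new InvalidRPCAddressException(); };
        Rpc fresh = m -> "ok:" + m;
        System.out.println(callWithRecovery(stale, "getListing", () -> fresh));
    }
}
```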

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12487564/addr_change_dfs.patch.txt
          against trunk revision 1148933.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/760//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          The patch is for branch-0.20-security. The precommit failed because it tried to apply the patch to TRUNK.

          Kihwal Lee added a comment -

          Removed a line of debug code that was commented out.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12487718/addr_change_dfs-1.patch.txt
          against trunk revision 1150565.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/764//console

          This message is automatically generated.

          Kihwal Lee added a comment -

          Result of running test-patch on 0.20-security:

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.
          [exec]
          [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
          [exec]

          Kihwal Lee added a comment -

          Description of the manual testing performed.

          Set up:

           • Namenode runs on "testhost".
           • "testhost" is defined in /etc/hosts. A script switches the IP address of testhost between 127.0.0.1 and the real IP address of the box.
           • hostname is set to testhost.
           • nscd and avahi are not running.

          Procedure:

          • Start HDFS.
          • Put files.
          • Call a script to perform FS operations using fs shell.
           • Kill NN. => Operations block and the RPC client retries. This is the connection-refused case, so the clients give up quickly. The timeout case (node or switch shutdown) lasts about 15 minutes before returning an exception.
           • Before the clients give up, switch the IP address and start the namenode.

          Result:

          • The outstanding calls (in invoke()) get InvalidRPCAddressException.
          • The clients that were blocked at the RPC initialization unblock and work.
          • The clients that retry on exception will recover on their own.
          Suresh Srinivas added a comment -

          Kihwal, can you address the -1s from test-patch? Are they related to this patch?

          Kihwal Lee added a comment -

           Sure. And I found a bug and also made some improvements. A new patch will be posted soon.

          Kihwal Lee added a comment -

           My patch only works for HDFS clients accessing one cluster. It won't work for things like distcp, where map tasks get the remote name node address and reverse lookups may not return the name (CNAME or alias) we want, so the address change detection won't work. Delegation tokens will also stop working, since the key to the token cache contains the server IP address. So the current approach is not very useful in letting remote clients deal with an IP address change.

          The final patch will involve more changes.

          Suresh Srinivas added a comment -

          Either way it's not clean as long as the upper layer keeps using the old resolved address.

          My preference is for the upper layer to not care about the lower layer connection semantics.

          Re-resolving part is fine. But the fact that the callers keep the old address in their immutable InetSocketAddress object can cause problems. Since we cannot afford to check the mapping in every RPC invocation by re-resolving, the connection will go through if NN is started with the old address by mistake. Also the connection cache in RPC Client uses the address as the key. Do we keep the old key? Or update with the new one? Either way it's not clean as long as the upper layer keeps using the old resolved address.

           Upper layers pass the InetSocketAddress down. They do not hold on to it. Can you point to where it is held on to? I am not sure that connecting to the old address if the NN is restarted there is a critical problem we need to deal with. Given the scenario you are solving, it is unlikely.
           One of the things I was thinking of was to replace the InetSocketAddress passed to the underlying layers with a wrapper, which allows updating the address with a newly resolved address.

          Some comments for the patch:

           1. DFSClient did not have any notion of addresses; that was only in the layer below. The current code handles the exception in every RPC call, and this repetitive code should be avoided. Also, as you noted, this only works for DFS and not for all the RPCs.
           2. The IPC Client now introduces a new exception. The implementations that currently use IPC/RPC do not handle this exception gracefully.

          I also think we should create a jira to ensure this is tested in 0.23 and does not break. I am not sure if this should be a blocker for 0.22?
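The wrapper Suresh suggests could be sketched like this (the class name and methods are hypothetical):

```java
import java.net.InetSocketAddress;

class UpdatableAddressDemo {
    // Hypothetical mutable holder: lower layers keep this object, and a
    // re-resolve replaces the InetSocketAddress inside it without the upper
    // layers having to know.
    static class UpdatableAddress {
        private volatile InetSocketAddress current;
        UpdatableAddress(String host, int port) {
            current = new InetSocketAddress(host, port);
        }
        InetSocketAddress get() { return current; }
        void reResolve() {
            current = new InetSocketAddress(current.getHostName(), current.getPort());
        }
    }

    public static void main(String[] args) {
        UpdatableAddress addr = new UpdatableAddress("localhost", 8020);
        addr.reResolve();  // picks up any DNS change; port is unchanged
        System.out.println(addr.get().getPort());
    }
}
```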

          Suresh Srinivas added a comment -

          I have not looked at HADOOP-7380 in detail. It may be worth taking a look at it.

          Kihwal Lee added a comment -

          We are working on a new approach, which will address both 1 and 2.

          Upper layers pass the InetSocketAddress down. They do not hold on to it.

          They don't. It's that they create an InetSocketAddress and the lower layers have no way of knowing what it was originally instantiated with. This is a headache when dealing with tokens.

          One of the things I was thinking was to replace InetSocketAddress to the underlying layers with a wrapper, which allows updating the address with new resolved address.

           We thought about this. Darren is working on the token renewal problem, and we found we can have a common solution. One way was to do what you mentioned. But we decided to keep it as is and use createUnresolved() to create the InetSocketAddress, so that we know what it was instantiated with. If the user supplied an IP address to begin with, we won't handle it. (I think this case was indistinguishable before.) The token will carry whatever the user used (IP or name) in the beginning, and when a name is used, the key to the token cache won't change even when the address changes. So the delegation token should continue to work.

           As for HADOOP-7380, that is for the HA case, where the identities of the two name nodes are known beforehand. The failover proxy is for switching between the two pre-configured nodes. Since that is the HA strategy for 0.23, I don't think this patch will be applicable to trunk.
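The createUnresolved() idea above keeps the user-supplied name usable as a stable cache key; a sketch (the helper and hostname are illustrative):

```java
import java.net.InetSocketAddress;

class TokenKeyDemo {
    // Hypothetical cache-key helper: key by the original hostname rather
    // than the resolved IP, so the key survives an IP address change.
    static String cacheKey(InetSocketAddress addr) {
        return addr.getHostName() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // No DNS lookup is performed, so the user-supplied name is intact.
        InetSocketAddress byName =
            InetSocketAddress.createUnresolved("namenode.example.com", 8020);
        System.out.println(cacheKey(byName));
    }
}
```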

          Suresh Srinivas added a comment -

          We are working on a new approach, which will address both 1 and 2.

           Can you add more details?

          They don't. It's that they create an InetSocketAddress and the lower layers have no way of knowing what it was originally instantiated with. This is a headache when dealing with tokens.

          You could consider host name from the InetSocketAddress as the destination identification. I am not sure what you mean by "originally instantiated with" and why treating InetSocketAddress as carrier of host name information will not work.

          This is a headache when dealing with tokens.

           I am not sure what the headache is. I will ask Jitendra to comment on this from the security perspective.

          We thought about this. Darren is working on the token renewal problem and we found out we can have a common solution. One way was to do what you mentioned. But decided to keep it as is but use createUnresolved() to create an InetSocketAddress, so that we know what was used to instantiate it. If the user slapped in an IP address to begin with, we won't handle it. (I think it was indistinguishable before)

           Does this mean you will have a different implementation later? Like I said, we could treat InetSocketAddress as a carrier of the hostname and not attach any other semantics to it. This should be fine because InetSocketAddress is what is passed down, and the name (URL) is also resolved to an InetSocketAddress.

          The token will have whatever the user used (IP or name) in the beginning and in case of using name, the key to the token cache won't change even with addr changes. So the delegation token should continue to work.

          Jitendra, any comments on this? My thought is, if you wrap InetSocketAddress, appropriate key such as name or host name from InetSocketAddress could be used.

          Suresh Srinivas added a comment -

          I missed a comment:

          > As for HADOOP-7380, this is for the HA case, where the identity of the two name nodes are known beforehand. The failover proxy is for switching between the pre-configured two. Since this is the HA strategy for 0.23, I don't think this patch will be applicable to the trunk.
          I know that this is solving the HA case. However, I wanted to point it out to see whether a similar change can be made in only the lower layers.

          Kihwal Lee added a comment -

          Like I said, we could treat InetSocketAddress as carrier of hostname and not attach any other semantics to it. This should be fine because InetSocketAddress is what is passed. And the name(URL) is also resolved to InetSocketAddress.

          This is pretty much where we are going. Since we are only honoring the hostname part of InetSocketAddress, ipc.Client.setupConnection() will always try to do a fresh lookup. Is that okay? Since this is not per call, I assume it is reasonable.

          Suresh Srinivas added a comment -

          +1 from me.

          Kihwal Lee added a comment -

          The only thing different from what you suggested is the use of InetSocketAddress.createUnresolved(). This is to keep the original string that the caller passed, which is necessary for the token cache to work. Daryn is making the token cache changes.

          Suresh Srinivas added a comment -

          Which jira is Daryn making this change in?

          Kihwal Lee added a comment -

          I think it is MAPREDUCE-1824. The common changes will likely go there, making this Jira dependent on it.

          Suresh Srinivas added a comment -

          I have not looked at MR-1824 in detail. What change will be done in this jira and what comes out of MR-1824?

          Kihwal Lee added a comment -

          InetSocketAddress.createUnresolved() will be used (in netutils) to replace regular instantiation of InetSocketAddress by upper layer callers. This InetSocketAddress object will stay unresolved and the hostname field will contain the string passed to it during instantiation, whether it's IP address or hostname.

          For making bare connections to the new namenode this is not necessary, but for the token cache to work, the original string used for creating the InetSocketAddress is needed. My patch (yet to be posted) will work on its own if security is disabled. After MAPREDUCE-1824, my patch will be very simple, much like what you initially suggested.
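          The difference createUnresolved() makes can be seen with plain JDK calls. The sketch below is illustrative only (the host name is made up) and is not part of the patch:

```java
import java.net.InetSocketAddress;

public class UnresolvedDemo {
    public static void main(String[] args) {
        // createUnresolved() stores the string verbatim and never performs
        // a DNS lookup, so the caller's original host string is preserved.
        InetSocketAddress unresolved =
            InetSocketAddress.createUnresolved("nn.example.com", 9000);
        System.out.println(unresolved.isUnresolved());   // true
        System.out.println(unresolved.getHostName());    // nn.example.com

        // The regular constructor resolves eagerly and caches the IP
        // inside the object (localhost is used so this resolves anywhere).
        InetSocketAddress resolved = new InetSocketAddress("localhost", 9000);
        System.out.println(resolved.isUnresolved());     // false
    }
}
```

          Because the unresolved form never caches an IP, whatever string the caller supplied survives for later use, e.g. as a token cache key.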

          Suresh Srinivas added a comment -

          Thanks for the update Kihwal.

          Jitendra Nath Pandey added a comment -

          How does MAPREDUCE-1824 relate to this one?

          Kihwal Lee added a comment -

          How does MAPREDUCE-1824 relate to this one?

          The major use case is that the cluster is restarted after a name node address change, but it's always easy to restart the clients outside the cluster. One example is an inter-cluster data management system built on top of distcp. In this case, the distcp map jobs should continue to work, and this requires proper handling of delegation tokens. Without it, this work cannot support major use cases.

          Kihwal Lee added a comment -

          > but it's always easy to restart the clients outside the cluster.
          Sorry, it should read "it's not always easy to restart the clients outside the cluster".

          Kihwal Lee added a comment -

          The new patch is very close to what Suresh initially suggested. I decided not to do the check at response timeout because it changes the service semantics.

          Following is the result of my testing described above.

          $ ./hadoop fs -ls
          11/08/01 16:53:20 INFO ipc.Client: Retrying connect to server: testhost/127.0.0.1:9000. Already tried 0 time(s).
          11/08/01 16:53:21 INFO ipc.Client: Retrying connect to server: testhost/127.0.0.1:9000. Already tried 1 time(s).
          11/08/01 16:53:22 INFO ipc.Client: Retrying connect to server: testhost/127.0.0.1:9000. Already tried 2 time(s).
          11/08/01 16:53:23 INFO ipc.Client: Retrying connect to server: testhost/127.0.0.1:9000. Already tried 3 time(s).
          11/08/01 16:53:23 WARN ipc.Client: Address change detected. Host: testhost OldAddr: 127.0.0.1 NewAddr: testhost/10.xx.xx.xx
          11/08/01 16:53:24 INFO ipc.Client: Retrying connect to server: testhost/10.xx.xx.xx:9000. Already tried 0 time(s).
          11/08/01 16:53:25 INFO ipc.Client: Retrying connect to server: testhost/10.xx.xx.xx:9000. Already tried 1 time(s).
          11/08/01 16:53:26 INFO ipc.Client: Retrying connect to server: testhost/10.xx.xx.xx:9000. Already tried 2 time(s).
          11/08/01 16:53:27 INFO ipc.Client: Retrying connect to server: testhost/10.xx.xx.xx:9000. Already tried 3 time(s).
          Found 1 items
          -rw-r--r--   1 kihwal supergroup  327499776 2011-07-22 11:30 /user/kihwal/ddd

          The result of test-patch:

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          When the patch for MAPREDUCE-1824 becomes available, there might be an overlap where connect() is called.

          Suresh Srinivas added a comment -

          Client#connections is a map of ConnectionId to Connection. ConnectionId's hash code uses the address corresponding to the InetSocketAddress, along with other keys (see ConnectionId#hashCode()).

          Before failure, map has ConnectionId (address x) mapping to Connection(address x)
          On failure recovery, map has ConnectionId (address x) mapping to Connection(address y).

          At this point in time:

          1. Connection close actually correctly removes the map entry using ConnectionId.
          2. A new ConnectionId(address x) still finds the cached Connection(address y)
          3. A new ConnectionId(address y) will not find cached Connection(address y).

          On encountering 3) for the first time, a new ConnectionId(address y) to Connection(address y) mapping will be added to the map. This is not a problem. Just wanted to see if you can think of any issues I might be missing, given my description.
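          The three cases can be modeled with a plain HashMap. The classes below are stand-ins that mirror the discussion (a key hashing on the resolved address, a connection that may later re-resolve), not Hadoop's actual ipc.Client internals:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Simplified model of the Client#connections cache described above.
public class ConnectionCacheModel {
    static final class ConnectionId {
        final String addr; final int port;
        ConnectionId(String addr, int port) { this.addr = addr; this.port = port; }
        @Override public boolean equals(Object o) {
            return o instanceof ConnectionId
                && addr.equals(((ConnectionId) o).addr)
                && port == ((ConnectionId) o).port;
        }
        @Override public int hashCode() { return Objects.hash(addr, port); }
    }
    static final class Connection {
        String currentAddr;
        Connection(String addr) { currentAddr = addr; }
    }

    public static void main(String[] args) {
        Map<ConnectionId, Connection> connections = new HashMap<>();
        ConnectionId keyX = new ConnectionId("10.0.0.1", 9000);
        connections.put(keyX, new Connection("10.0.0.1"));

        // Failure recovery: the connection moves to address y, but the
        // cache key still carries address x.
        connections.get(keyX).currentAddr = "10.0.0.2";

        // (2) A new ConnectionId with the old address still finds the
        //     updated connection; (3) one with the new address misses.
        System.out.println(connections.get(new ConnectionId("10.0.0.1", 9000)).currentAddr);
        System.out.println(connections.containsKey(new ConnectionId("10.0.0.2", 9000)));
    }
}
```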

          Minor comment on the patch: isAddressChanged(boolean update) is always called with update set to true. Do you think it would be better to change this method name to updateAddress() with no args? I also feel that instead of "Check whether the hostname:address mapping is still valid.", we could say something like "Update the address corresponding to server if the address corresponding to the host name has changed."

          Kihwal Lee added a comment -

          If the same caller, with the same ugi, calls RPC.getProxy() with a newly resolved InetSocketAddress, #3 can happen. So there can be duplicate connections to the same server for the same user. If the old connection is no longer used, it will eventually be terminated from the server side.

          But Daryn's patch will make the callers use InetSocketAddress.createUnresolved(). This makes #3 go away, since the IP address will not be part of the ConnectionId unless the user hardcodes it.

          I will incorporate your comment on isAddressChanged().

          Kihwal Lee added a comment -

          Result of test-patch

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
          [exec]

          Suresh Srinivas added a comment -

          Do you think it would be better to change this method name to updateAddress() with no args?

          Are you planning to address this?

          Kihwal Lee added a comment -

          Are you planning to address this?

          The new patch has the change.

          Kihwal Lee added a comment -

          The patch for TRUNK is attached. I ran the tests and org.apache.hadoop.fs.TestFilterFileSystem failed. It seems to be due to an unimplemented method and has been failing all the time lately. This is not related to this patch.

          Kihwal Lee added a comment -

          New patches are attached. The method name change is the only change.

          Kihwal Lee added a comment -

          The result of test-patch for the 0.20-security patch

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          Daryn Sharp added a comment -

          Could the patch be simplified to something like:

          private synchronized void updateAddress() {
            InetSocketAddress currentInetAddr = new InetSocketAddress(server.getHostName(), server.getPort());
            if (!server.equals(currentInetAddr)) {
              LOG.warn(...);
              server = currentInetAddr;
              timeoutFailures = ioFailures = 0;
            }
          }
          

          Then call this method in handleConnectionFailure(...)?

          Kihwal Lee added a comment -

          We are both wrong.

          We lose the original host name in both cases; the name should be saved early on. The hostname fields in InetSocketAddress and InetAddress are NOT populated during instantiation. They are filled in only after getHostName(), which does a reverse lookup, and that's not what we want. We need to keep the original name.

          I will update the patch.

          Kihwal Lee added a comment -

          If you call InetAddress.getByAddress(String hostname, byte[] addr), the hostname is saved in the new InetAddress, so getHostName() will return the hostname used to instantiate this InetAddress object. But if InetAddress.getByName() is used, only the resolved address is saved. The hostname is saved in h_name of struct hostent, but it is never looked at.

          So, new InetSocketAddress(name, port) will not store name in its hostName field nor in InetAddress.hostname.

          I need to do a lookup with the old name and then construct an InetAddress with the old name and the new address. Only then can I create a new InetSocketAddress from it, and InetSocketAddress.getHostName() will get the name from the InetAddress I created.

          This assumes that the InetSocketAddress passed to getProxy() contains the name or alias that stays the same after IP address changes. My very first patch tried to address this in DFSClient, but it was limited in scope. MAPREDUCE-2764 will have a more comprehensive solution.
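          The getByAddress() behavior described above can be verified directly against the JDK; the host name here is a made-up example:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class HostnamePreserveDemo {
    public static void main(String[] args) throws Exception {
        // getByAddress(hostname, addr) stores the supplied name without
        // any DNS activity, so getHostName() returns it verbatim instead
        // of triggering a reverse lookup.
        InetAddress addr = InetAddress.getByAddress(
                "nn.example.com", new byte[] {127, 0, 0, 1});
        System.out.println(addr.getHostName());      // nn.example.com
        System.out.println(addr.getHostAddress());   // 127.0.0.1

        // Wrapping it in an InetSocketAddress keeps the same name.
        InetSocketAddress sock = new InetSocketAddress(addr, 9000);
        System.out.println(sock.getHostName());      // nn.example.com
    }
}
```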

          Kihwal Lee added a comment -

          Here is the corrected version.

              private synchronized boolean updateAddress() throws IOException {
                String oldAddr = server.getAddress().getHostAddress();
                String hostName = server.getHostName();
                InetAddress currentInetAddr = InetAddress.getByName(hostName);

                if (!oldAddr.equals(currentInetAddr.getHostAddress())) {
                  // Create an InetAddress obj with the original hostname and the
                  // new address. It makes getHostName() return the original host
                  // name instead of doing a reverse lookup.
                  InetAddress newInetAddr = InetAddress.getByAddress(hostName,
                                                currentInetAddr.getAddress());
                  server = new InetSocketAddress(newInetAddr, server.getPort());
                  LOG.warn("Address change detected. Host: " + hostName + " OldAddr: " +
                            oldAddr + " NewAddr: " + currentInetAddr);
                  return true;
                }
                return false;
              }
          
          Kihwal Lee added a comment -

          I reread the JDK code: InetAddress.getByName(hostName) saves the hostname in the Inet4Address. So Daryn's suggestion will actually work.

          Kihwal Lee added a comment -

          Okay, I will do the following. This is functionally equivalent to what I had before, but simpler.

              private synchronized boolean updateAddress() throws IOException {
                // Do a fresh lookup with the old host name.
                InetSocketAddress currentAddr =  new InetSocketAddress(
                                         server.getHostName(), server.getPort());
          
                if (!server.equals(currentAddr)) {
                  LOG.warn("Address change detected. Old: " + server.toString() +
                                           " New: " + currentAddr.toString());
                  server = currentAddr;
                  return true;
                }
                return false;
              }
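          The check above hinges on InetSocketAddress.equals(): for resolved addresses it compares the resolved IP and port rather than the host string, which is why a fresh lookup exposes a remapping. A quick stand-alone illustration (localhost is used only so the sketch resolves anywhere):

```java
import java.net.InetSocketAddress;

public class EqualsSemantics {
    public static void main(String[] args) {
        // Two resolved addresses for the same host and port compare equal
        // because equals() compares the resolved IP, not the host string.
        InetSocketAddress a = new InetSocketAddress("localhost", 9000);
        InetSocketAddress b = new InetSocketAddress("localhost", 9000);
        System.out.println(a.equals(b));   // true

        // An unresolved address never equals a resolved one, even for the
        // same host string, so mixing the two styles defeats the check.
        InetSocketAddress u = InetSocketAddress.createUnresolved("localhost", 9000);
        System.out.println(u.equals(a));   // false
    }
}
```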
          
          Kihwal Lee added a comment -

          The new patches based on the changes shown above.

          Daryn Sharp added a comment -

          +1. Minor concern about round-robin DNS causing connection attempts forever.

          Just so everyone understands the caveats:

          1. A process that is running when the IP changes will reconnect and continue to use its tokens, since it knows how it originally contacted the service – new behavior
          2. New processes using a token cache (i.e. mapreduce) will be unable to find their tokens, since the token is keyed on IP – same as before
          3. The job tracker will be unable to renew tokens when the IP changes – same as before.

          What this means is that the change is useful in cases such as an NN IP change. Existing mappers will reconnect and complete, but mappers spawned after the IP change won't find their tokens.

          Kihwal Lee added a comment -

          How this Jira depends on MAPREDUCE-2764

          For the patch in this Jira to work, the upper layer (e.g. DFSClient) must use the host name or CNAME (alias) that is intended to locate the service when instantiating the InetSocketAddress of the RPC server (e.g. namenode). Otherwise, detecting an IP address change may not work. Unfortunately, this is not always possible, because certain components force only the IP address string to be used for creating the InetSocketAddress, and a reverse lookup after that may not return the original host name or alias.

          An early attempt was made to ensure the correct behavior for DFSClient, but it was quickly realized that tokens won't work due to the use of a cached IP address in token renewals. For this reason, MAPREDUCE-2764 must deal with the same problem (caching and/or exclusive use of the IP address) in a broader scope.

          In more detail, MAPREDUCE-2764 affects this Jira in the following ways:

          • Provide a way to keep the original host name string the user used and let the RPC Client access it. This is critical for reliable detection of IP address changes.
          • Let the token cache and token renewal work regardless of IP address changes. This enables this Jira to support a broader range of clients/apps, especially long-running ones.

          This Jira (IP address change detection) tries to achieve somewhat better error recovery by avoiding restarts of some components. The initial target is very narrow, but future work may broaden the scope.

          Kihwal Lee added a comment -

          Minor concern about round-robin DNS causing connection attempts forever.

          I am curious whether people actually use RR DNS against the NN. There might be cases where multiple NICs on one machine are load-balanced that way, but I would think bonding does a better job, since RPC connections are relatively long-lived compared to things like HTTP requests.

          Daryn Sharp added a comment -

          Regarding my DNS comment, it's just Murphy's Law... A likely scenario today is a site with wildcard DNS. If multiple VIPs are returned, a typo in a hostname will cause the RPC client to pound away on the VIPs forever.

          Note: The changes for supporting the use of the original hostname used in the connection have been split out from MAPREDUCE-2764 to HADOOP-7510.

          Kihwal Lee added a comment -

          I removed the dependency on MAPREDUCE-2764 because the patch still does its job for a set of components/services. Daryn agrees that this Jira is useful and functional as is and can be committed independently. The ones I verified working (automatic recovery after the RPC server's IP address changes) are:

          • DistributedFileSystem (when used without delegation tokens)
          • Datanode to Namenode heartbeat

          For other components, this patch does not add more failures. MAPREDUCE-2764 will be a valuable improvement, and other components using the RPC clients may also benefit once it is committed.
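          The verified recovery behavior can be sketched roughly as follows. This is an illustrative re-resolve-on-failure idea, not the committed Client.java code; all class, method, and field names here are assumptions:

```java
import java.net.InetSocketAddress;

// Rough sketch of the reconnect idea this patch enables: on a connection
// failure, re-resolve the server's host name before the next retry so that
// a moved NameNode is picked up without restarting the client process.
public class ReconnectSketch {
    private InetSocketAddress server;

    public ReconnectSketch(InetSocketAddress server) { this.server = server; }

    // Hypothetically called from the failure-handling path of the retry loop.
    // Returns true if DNS now maps the host to a different address.
    public boolean refreshServerAddress() {
        InetSocketAddress fresh =
            new InetSocketAddress(server.getHostName(), server.getPort());
        if (!fresh.isUnresolved()
            && !fresh.getAddress().equals(server.getAddress())) {
            server = fresh;   // DNS now points elsewhere; retry against it
            return true;
        }
        return false;         // no change; normal retry/backoff applies
    }

    public InetSocketAddress current() { return server; }
}
```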

          Suresh Srinivas added a comment -

          +1 for the patch.

          One concern I have with the patch: when the address changes, depending on the exception that triggers address-change detection, either timeoutFailures or ioFailures is reset to zero. Should we reset both of them?
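          A minimal sketch of the reset question (the counter names mirror the comment, but this code is illustrative, not the actual patch): resetting both counters on an address change gives the new address a clean retry budget, regardless of which exception triggered the detection.

```java
// Hypothetical illustration of the counter-reset concern; not committed code.
public class FailureCounters {
    private int timeoutFailures;
    private int ioFailures;

    // Reset both counters, regardless of whether a timeout or an I/O
    // exception triggered the address-change detection.
    public void onAddressChange() {
        timeoutFailures = 0;
        ioFailures = 0;
    }

    public void recordTimeout()   { timeoutFailures++; }
    public void recordIoFailure() { ioFailures++; }
    public int timeouts() { return timeoutFailures; }
    public int ioErrors() { return ioFailures; }
}
```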

          Kihwal Lee added a comment -

          The result of test-patch is identical to the above. On trunk, one additional test fails:
          org.apache.hadoop.util.TestShell
          But it also fails without my patch applied.

          Kihwal Lee added a comment -

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          Kihwal Lee added a comment -

          For Trunk, mvn clean install -Ptar -Ptest-patch was run.
          Results :

          Tests in error:

          Tests run: 1334, Failures: 0, Errors: 1, Skipped: 0

          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD FAILURE
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: 6:58.706s
          [INFO] Finished at: Tue Aug 09 17:21:52 CDT 2011
          [INFO] Final Memory: 10M/52M
          [INFO] ------------------------------------------------------------------------

          The following is the failed test, which also fails without this patch.

          Running org.apache.hadoop.fs.TestFilterFileSystem
          Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.246 sec <<< FAILURE!

          The justification for the missing test was given in previous comments. I see a better chance of having a meaningful test in trunk than in 0.20-security. I will file a separate Jira for potentially introducing new packages that enable such a test.

          Suresh Srinivas added a comment -

          I committed the patch to trunk and 0.20-security. Thank you, Kihwal.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #721 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/721/)
          HADOOP-7472. RPC client should deal with IP address change. Contributed by Kihwal Lee.

          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1156350
          Files :

          • /hadoop/common/trunk/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
          Matt Foley added a comment -

          Closed upon release of 0.20.205.0


            People

            • Assignee:
              Kihwal Lee
              Reporter:
              Kihwal Lee
            • Votes:
              0
              Watchers:
              10
