Accumulo / ACCUMULO-1585

Provide option for FQDN/verbatim data from config files of servers to be stored in ZooKeeper rather than resolved IP

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: tserver
    • Labels: None
    • Environment: All

      Description

      In some situations (especially in virtualized/cloud environments), "hardwiring" the resolved IP into ZooKeeper can create reachability issues. Storing an FQDN (or, better, the verbatim string from the relevant config file) would fix this problem.

      For example, the hostname node1.company.com specified in the configuration files resolves to an Amazon EC2 internal IP of 10.2.3.4 (internal to the virtualized network). Externally (e.g. from your dev machine, your offsite/non-VPN/non-VPC data center, or other client machines on different networks/clouds), node1.company.com resolves to a public, routable IP (e.g. an Amazon Elastic IP) such as 54.55.56.57.

      Accumulo currently stores 10.2.3.4 in ZooKeeper based on this resolution. If you then try to connect to Accumulo from outside the same cloud/virtualized/non-routable network, where the same FQDN (node1.company.com) resolves to the public address (54.55.56.57), you will not be able to connect, because the Accumulo client will have pulled the already-resolved IP of 10.2.3.4, which is unreachable from there.

      Using the FQDN (or in some other way allowing for client-side name resolution/address translation, although that seems kludgy) would fix this issue in a relatively standard way. Ideally, this would not incur a performance cost beyond the first resolution, provided the resolver/network stack caches lookups effectively (I assume it does).

      Offering this as a config option doesn't really hurt or break anything, and taking the literal from the file lets you use whatever you want, which is the ultimate in flexibility.
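To make the scenario above concrete, here is a toy model of it. It does not touch real DNS: the two maps are hypothetical stand-ins for what the name resolves to from inside and outside the cloud network, and the class and method names are invented for illustration only (none of this is Accumulo code).

```java
import java.util.HashMap;
import java.util.Map;

final class SplitDnsDemo {
    // Hypothetical resolver tables: the same name as seen from two vantage points.
    static final Map<String, String> INTERNAL_DNS = new HashMap<>();
    static final Map<String, String> EXTERNAL_DNS = new HashMap<>();
    static {
        INTERNAL_DNS.put("node1.company.com", "10.2.3.4");
        EXTERNAL_DNS.put("node1.company.com", "54.55.56.57");
    }

    // Behavior described in this issue: resolve on the server, publish the IP.
    static String publishResolved(String configured) {
        return INTERNAL_DNS.get(configured);
    }

    // Requested behavior: publish the configured string verbatim.
    static String publishVerbatim(String configured) {
        return configured;
    }

    // What an external client ends up dialing for a published value.
    // Names go through the client's own view of DNS; IP literals are used as-is.
    static String externalClientDials(String published) {
        return EXTERNAL_DNS.getOrDefault(published, published);
    }

    public static void main(String[] args) {
        String configured = "node1.company.com";
        // Internal, unreachable address leaks to the external client.
        System.out.println(externalClientDials(publishResolved(configured)));
        // Client resolves the name itself and gets the public address.
        System.out.println(externalClientDials(publishVerbatim(configured)));
    }
}
```

The first line printed is the internal address (10.2.3.4), the second is the public one (54.55.56.57), which is the whole difference this issue is about.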

      See discussion http://mail-archives.apache.org/mod_mbox/accumulo-user/201307.mbox/%3CCAGFNOZTMVz0R2e0meDj%3DKqPPPJP6f5baaMqh8%3D07V7NZ8vToJg%40mail.gmail.com%3E for more details and others having the same issue.

      I will look into creating a patch for this as soon as I have time to find and read the relevant code. I need to find where Accumulo makes these writes to ZK, and whether FQDNs read back from ZK would need resolution, i.e. whether their use further down the line expects strictly an IP or goes through API calls that are safe for either a host or an IP. Any suggestions on where to begin are always appreciated. Otherwise, I'll try to submit a patch when I can.

      I figured I'd open this issue to at least start a discussion on what more experienced Accumulo devs and users think, and on what a solution that fits the accepted style/patterns for Accumulo development and configuration would look like. I can read the guidelines myself, of course, and will, but someone suggested opening an issue, so I am...

          Activity

          Eric Newton added a comment -

          I'm signing up to do this, but I will take all the help I can get. I'll make it a priority to review/patch. If you can do it yourself, please reassign the ticket.

          Is there some approach we should take, similar to the way that datanodes self-identify to the namenode? That might be even better than the FQDN.

          Basit Mustafa added a comment -

          Hi Eric,

          Thank you for volunteering to lead this. I have not made any changes to the Accumulo source and haven't really dug into the internals, although I have developed for it and deployed it in test and production. I'm an experienced Java developer who just doesn't know the ins and outs of the Accumulo source yet, but I am willing to help however I can.

          To answer your question: the problem described in the discussion thread would likely be addressed by the pattern you mentioned, in which HDFS data nodes self-identify to the namenode using the namespaceID. But I don't think it addresses the case I describe, which is network-related reachability when a host resolves to multiple IPs. I don't mean load balancing, DNS round-robin, etc. (which really should not be used between Accumulo nodes at all). I mean a name that resolves to an internal address for a process running on a machine inside a given environment, and to a different, public IP for a process running outside that environment, although both addresses truly terminate at the same machine/instance.

          I cannot envisage a standard, globally applicable solution to this situation other than FQDNs that doesn't reinvent the wheel of FQDN-based resolution. The crux of the reachability issue is that resolution takes place once, on the (arbitrary?) node that populates ZooKeeper on start/state change, rather than each individual node doing its own resolution and trusting DNS to give it the IP that gets it to the host. In this situation, using an internal IP between machines inside an EC2 availability zone (and/or a rack/hypervisor in your own cloud) is highly desirable, because you are on in-memory/backplane virtual interfaces that are very fast and zero cost. Hitting the public IP instead (absent any optimization done by the stack, which is rare, and most certainly not done on EC2 or most hypervisors I know) routes the traffic to at least the closest edge device, which hurts performance and, on EC2, incurs regional data transfer charges. This "schizophrenia", where the answer depends on where the client, tablet server, etc. process sits relative to where the configured hosts were first resolved, is what creates the reachability issue. So, while an HDFS DN-NN style self-identification system might address IPs that simply change, without any reachability problem or private/public "schizophrenia", in my mental model it doesn't address this reachability case, which is actually quite common in the environments we work and deploy in.

          I think making this the default behavior (i.e. the verbatim string from whatever the user put in the gc/monitor/tracer/masters/slaves files is entered into ZooKeeper) is best, because it is predictable. The current behavior is non-standard in my experience: the system further processes/resolves my config-file input, without an obvious reason, and writes the resolved address to ZooKeeper. I am a newb to Accumulo, and there may well be a very good design reason this is done. If so, I would happily learn from it and brainstorm other solutions, such as a flag/config option that enables non-resolution at the cost of whatever benefit that reason provides. But with my current knowledge, a flag telling the system to "not touch" the entered network name makes less sense than a flag telling it to resolve the name. Again, this is in my Accumulo-ignorant state.

          Basit Mustafa added a comment -

          A very perfunctory examination of what's going on at startup has me believing that a basic fix (unconditional, with no config flag to make the behavior optional) would go something like this:

          In org.apache.accumulo.server.tabletserver.TabletServer:

          1) Add an ivar to store the hostname String exactly as passed to the config(String hostname) method (judging from the output of this method's first log statement, it is not yet resolved at that point but is as typed in the config, which is a good thing).

          2) From here, a few paths are possible:

          A. One COULD just modify getClientAddressString() to not return a resolved address. That assumes this method's contract does not guarantee an IP:PORT String and that all callers are safe using an FQDN or whatever the config file had verbatim. The documentation/comment does not state a specific contract, but the lack of strong typing of the return value to an IP:PORT type (e.g. InetSocketAddress or something) makes me hopeful this would work (although I could see this blowing up in all kinds of ways, too, if callers of getClientAddressString() expect the String to be IP:PORT).

          B. If this doesn't work, or we don't want to change the nature of this method because that would violate its unwritten contract/caller expectation that it return IP:PORT, we could write the FQDN/hostname exactly as passed into config(String hostname) (now stored in the ivar from #1) to ZooKeeper only, and keep all Accumulo internals as-is. This works, IMHO, because the internals past this point all run in the same JVM, and resolution within one JVM should be consistent, barring DNS round-robin/load balancing [uh, just don't do this between nodes in an Accumulo cluster :)]. So as long as we write FQDNs to ZK, we won't have the aforementioned schizophrenia. Then we're on the hook to discover where the /accumulo/<instance>/tservers/XXXXXXXXX entries are read on the client and ensure that the read resolves the retrieved FQDN/string, or at least runs it through AddressUtils.toString().

          Obviously, B involves the fewest changes to Accumulo code, and it seems pretty straightforward since reads/writes to ZK are unified in a single set of classes. A makes some large assumptions about the safety of changing the format of that String output; I'd feel better about it knowing what its author (and its callers) intended. I haven't done a "who calls this" analysis, though I guess I could smoke test it, too. But B just seems like the path of much less resistance, assuming we only read the value from ZK in one or a few places.
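A minimal sketch of the client side of option B, using only java.net. It assumes the value read back from ZooKeeper is a plain "host:port" string (an assumption for illustration; the actual entry format is not shown in this thread), and the class and method names are hypothetical. The idea is to carry the name around unresolved and resolve it only on the machine that is about to connect.

```java
import java.net.InetSocketAddress;

final class DeferredResolution {
    // Parse "host:port" without touching DNS; the host is kept verbatim.
    static InetSocketAddress parseUnresolved(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon < 1) {
            throw new IllegalArgumentException("expected host:port, got " + hostPort);
        }
        String host = hostPort.substring(0, colon);
        int port = Integer.parseInt(hostPort.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    // Resolve with the local machine's view of DNS, just before connecting.
    static InetSocketAddress resolveLocally(InetSocketAddress unresolved) {
        return new InetSocketAddress(unresolved.getHostString(), unresolved.getPort());
    }

    public static void main(String[] args) {
        // Value as it might be read from ZooKeeper (illustrative host and port).
        InetSocketAddress fromZk = parseUnresolved("node1.company.com:9997");
        System.out.println(fromZk.getHostString() + " unresolved=" + fromZk.isUnresolved());

        // An IP literal needs no DNS lookup, so this part runs offline.
        InetSocketAddress local = resolveLocally(parseUnresolved("127.0.0.1:9997"));
        System.out.println(local.getPort() + " unresolved=" + local.isUnresolved());
    }
}
```

Whether Accumulo's client code could adopt this directly depends on what its callers expect, which is exactly the open question raised above.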

          Thoughts? Opinions? Anyone have any experience/know the code better than me to help shed light on assumptions or come up with C/D/E/F options that would be better?

          Thanks!

          Basit Mustafa added a comment -

          Of course, the above solution only works for the client-to-tserver connection case, and it further assumes the client code only ever needs to resolve/contact tservers. The same behavior would have to be replicated for every resolved-IP write to ZK wherever intra-cluster traffic (e.g. between a tracer, master, and tserver) crosses a DNS resolution "schizophrenia" boundary, and for any other services the client looks up and interacts with directly.

          That is another thing I'd like input on from more experienced folk.

          Eric Newton added a comment -

          I was thinking:

          • first time the tablet server starts, it generates a unique name using zookeeper
          • it scribbles this identity onto the local file system
          • it now uses this identity in zookeeper, but stores its IP address in its lock

          Now your whole cluster can go down, and get randomly assigned IPs and hostnames and everything will work as expected when it starts up.
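The first two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: a random UUID stands in for "a unique name generated using zookeeper", the file location is a temp directory, and ServerIdentity/loadOrCreate are hypothetical names, not Accumulo code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

final class ServerIdentity {
    // Return the identity stored in idFile, creating and persisting one on first start.
    static String loadOrCreate(Path idFile) throws IOException {
        if (Files.exists(idFile)) {
            return new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
        }
        // Assumption: a UUID replaces the ZooKeeper-generated unique name.
        String id = UUID.randomUUID().toString();
        Files.write(idFile, id.getBytes(StandardCharsets.UTF_8));
        return id;
    }

    public static void main(String[] args) throws IOException {
        Path idFile = Files.createTempDirectory("tserver-id").resolve("identity");
        String firstStart = loadOrCreate(idFile);
        String afterRestart = loadOrCreate(idFile);
        // The identity survives a "restart" regardless of the host's current IP.
        System.out.println(firstStart.equals(afterRestart));
    }
}
```

The third step (registering under this identity in ZooKeeper and storing the current IP in the lock) is left out because it depends on ZooKeeper client code not shown here.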

          Basit Mustafa added a comment -

          I think that is a good solution to the "same instance, new IP upon restart" problem, but I don't think it solves the primary reachability problem I describe. There, DNS resolution of the name "node1.company.com" on the machine running the JVM/Accumulo processes gives one answer (read: an internal, non-routable IP, for example in the 10.x.x.x space), while other machines (for example, on other clouds or on different hypervisors) resolve the same hostname to the remote machine's public, routable IP (i.e. not its 10.x.x.x address).

          Simply using the public address in the config in lieu of the FQDN doesn't work, for two reasons: 1) the machine does not have a network adapter with the public IP assigned, because the hypervisor routes that address to its 10.x.x.x/internal address, so when Accumulo tries to bind to the public IP (the monitor process specifically does this), it cannot find an interface to bind to and errors out; 2) traffic sent to the public address does not take advantage of the 10.x.x.x/internal network's higher speed and zero cost, since it goes out to a Layer 3 routing device even to communicate with local machines that might be on the same segment/hypervisor.

          Hopefully this makes sense. I can draw a diagram, too, if that helps.

          Keith Turner added a comment -

          Eric Newton, what problem does having a unique ID in the local filesystem per tablet server solve?

          Basit Mustafa added a comment -

          Keith, I think it solves the (secondary and somewhat related) problem raised in the discussion thread I referenced above, about the IP changing on machine restart (also quite common on virtualized machines, and almost guaranteed on IaaS providers such as EC2). But it does not solve the reachability issue, which, at least for me and for the purpose of creating this issue, is the primary problem to fix.

          Keith Turner added a comment -

          "I think it solves the (secondary + somewhat related) problem that was raised in the discussion thread I referenced above about IP changing on machine restart"

          I think Accumulo should already handle this. The lock related to the old tserver instance in ZooKeeper should expire. The master will see this and reassign the tablets.

          Basit Mustafa added a comment -

          Ah, ok, I did not know this, but it absolutely makes sense.

          So, really, this issue is simply about the original reachability issue when hostnames resolve to different values based on where in the network resolution is done.

          Keith Turner added a comment -

          "So, really, this issue is simply about the original reachability issue when hostnames resolve to different values based on where in the network resolution is done."

          Yes, for 1.5. In 1.4, the location of a write-ahead log is the IP address + port + log name, and this location is stored in the metadata table. If the IP address or port of a logger changes, the walogs on that logger cannot be found. See ACCUMULO-544. In 1.5, write-ahead logs are stored in HDFS.

          ASF subversion and git services added a comment -

          Commit 10b44e79544b5f16cd747de7926af23739bf5726 in branch refs/heads/master from Eric Newton
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=10b44e7 ]

          ACCUMULO-1585 track tablet servers by their entry in zookeeper, not by
          their resolved address
          ACCUMULO-1601 make interface hinting consistent across all servers

          ASF subversion and git services added a comment -

          Commit 3ed42c231927c467a8536ff0c4e094d652588a9c in branch refs/heads/master from Eric Newton
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=3ed42c2 ]

          ACCUMULO-1585 remove AddressUtil from server

          Eric Newton added a comment -

          Basit Mustafa, can you test 1.6-SNAPSHOT (in branch master) and see if it works for you?

          ASF subversion and git services added a comment -

          Commit 689414e28500f187a0d31266deca0975a09a91b0 in branch refs/heads/master from Eric Newton
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=689414e ]

          ACCUMULO-1585 fix tracer registration in zookeeper

          ASF subversion and git services added a comment -

          Commit 81684b769061339a144ad09b8f5a30251d46b8fa in branch refs/heads/master from Eric Newton
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=81684b7 ]

          ACCUMULO-1585 forgot to add new file

          Basit Mustafa added a comment -

          Wow! Thank you so much, Eric. I feel bad for not being able to help more with actually doing this, and really appreciate your quick work so far. Pulling the 1.6-SNAPSHOT today and will test!

          Basit Mustafa added a comment -

          Eric, it seems like everything is working quite well. I have not yet run a full test against it, but I have brought up a small Accumulo cluster on EC2, the ZK entries seem right, and I can connect to it from the outside world, where the DNS schizophrenia is taking place.

          I will run heavier tests later this weekend and report back, but everything preliminarily looks very good.

          Thank you again, very much, I really appreciate it and hope I can become more active as I learn more about the internals of Accumulo. I'll report back in a couple days.

          ASF subversion and git services added a comment -

          Commit 9d50657d90a4450604b7c64ba1870cfc0f8b1a3a in branch refs/heads/master from Eric Newton
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=9d50657 ]

          ACCUMULO-1585 do our best to guess our address after the bind

          Mike Drob added a comment -

          Basit Mustafa, did you have a chance to do heavier testing on this?

          Keith Turner added a comment -

          While working on ACCUMULO-1152 I ran into an issue with this. I was trying to check the tserver lock in the tablet location cache, but was unable to because the metadata table had IP addrs and zookeeper contained hostnames. Seems like the locations in the metadata table should exactly match what is advertised in zookeeper.

          Keith Turner added a comment -

          Looking at the code, InetSocketAddress is used, and the way it's used goes from hostname to IP to hostname, which seems like it could cause problems. I think we should take exactly what's in ZooKeeper and put it in the metadata table w/o any DNS or reverse DNS lookups. I am thinking about creating a simple object that represents hostname+port (w/o any DNS lookups) and using that for all metadata table location reads and writes. Then have this object move outwards in the source code from the places where it's used w/ the metadata table.
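
The hostname+port idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was committed: the class name `HostPort`, its `parse` method, and the example address `node1.company.com:9997` are hypothetical. The one real API used is `InetSocketAddress.createUnresolved`, shown for contrast because it also avoids a DNS lookup.

```java
import java.net.InetSocketAddress;
import java.util.Objects;

public class HostPortSketch {

    /** Hypothetical value type: keeps the host string verbatim, never touches DNS. */
    static final class HostPort {
        private final String host;
        private final int port;

        HostPort(String host, int port) {
            this.host = Objects.requireNonNull(host);
            this.port = port;
        }

        /** Parses "host:port" by plain string splitting (IPv6 literals are not handled). */
        static HostPort parse(String s) {
            int idx = s.lastIndexOf(':');
            if (idx <= 0 || idx == s.length() - 1) {
                throw new IllegalArgumentException("expected host:port, got: " + s);
            }
            return new HostPort(s.substring(0, idx), Integer.parseInt(s.substring(idx + 1)));
        }

        String getHost() { return host; }
        int getPort() { return port; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof HostPort)) return false;
            HostPort other = (HostPort) o;
            return host.equals(other.host) && port == other.port;
        }

        @Override
        public int hashCode() { return Objects.hash(host, port); }

        @Override
        public String toString() { return host + ":" + port; }
    }

    public static void main(String[] args) {
        // Example value only; the same string is assumed to be stored in both places.
        HostPort advertised = HostPort.parse("node1.company.com:9997");
        HostPort inMetadata = HostPort.parse("node1.company.com:9997");

        // Comparison is purely textual, so the two locations match exactly.
        System.out.println(advertised.equals(inMetadata));

        // For contrast: the JDK can also hold a host+port without resolving it.
        InetSocketAddress unresolved =
            InetSocketAddress.createUnresolved(advertised.getHost(), advertised.getPort());
        System.out.println(unresolved.isUnresolved());
    }
}
```

Because the host string is never resolved, whatever a server advertises is what gets compared and written back, and name resolution is left to whichever client eventually connects.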

          ASF subversion and git services added a comment -

          Commit 3de0c1ec44d476fe1465eb05d33c32de3ccf1068 in branch refs/heads/master from [~keith_turner]
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=3de0c1e ]

          ACCUMULO-1585 Replaced most InetSocketAddress usage with HostAndPort to avoid DNS and reverse DNS lookups

          ASF subversion and git services added a comment -

          Commit 40e167ce42c194e707a6b0941248704835ddb071 in branch refs/heads/master from Eric Newton
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=40e167c ]

          ACCUMULO-1585 sometimes addresses are advertized as host+port

          ASF subversion and git services added a comment -

          Commit e79adf2ed1c0b644963381f58bb1b0e7f750b845 in branch refs/heads/master from Eric Newton
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=e79adf2 ]

          ACCUMULO-1585 fix typo

          Eric Newton added a comment -

          Closing. Open tickets for any remaining bugs.

          ASF subversion and git services added a comment -

          Commit 839d689f67a53e1b34e01d69a989232d289c8bf8 in branch refs/heads/master from [~keith_turner]
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=839d689 ]

          ACCUMULO-1585 add guava to tool.sh


            People

            • Assignee:
              Eric Newton
              Reporter:
              Basit Mustafa
            • Votes:
              1
              Watchers:
              4

                Time Tracking

                Estimated:
                Original Estimate - 12h
                Remaining:
                Remaining Estimate - 12h
                Logged:
                Time Spent - Not Specified

                  Development