Whirr / WHIRR-459

DNS Failure when trying to spawn HBase cluster

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.2
    • Component/s: None
    • Labels:
      None
    • Environment:

      Trying to use Whirr from behind a NAT

      Description

      When I launch an HBase cluster from a machine that sits behind a NAT, I get the following exception. The cluster is spawned and then immediately destroyed. The same command runs fine from another EC2 instance.

      bin/whirr launch-cluster --config hbase-ec2.properties
      Bootstrapping cluster
      Configuring template
      Configuring template
      Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
      Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
      Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase, name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1, privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
      Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase, name=hbase-54902036, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-7-29-242, privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]], [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase, name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE, description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true, description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml], state=RUNNING, loginPort=22, hostname=ip-10-108-182-53, privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211], hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null, processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd, durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sde, durable=false, isBootDevice=false]], supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()), tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
      Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
      Unable to start the cluster. Terminating all nodes.
      org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
      at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
      at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
      at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
      at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
      at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
      at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
      at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
      at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
      at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
      at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
      at org.apache.whirr.cli.Main.run(Main.java:64)
      at org.apache.whirr.cli.Main.main(Main.java:97)
      Caused by: java.net.ConnectException: Connection refused
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
      at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
      at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
      at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
      at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
      at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
      at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
      ... 11 more
      Unable to load cluster state, assuming it has no running nodes.
      java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
      at java.io.FileInputStream.open(Native Method)
      at java.io.FileInputStream.<init>(FileInputStream.java:137)
      at com.google.common.io.Files$1.getInput(Files.java:100)
      at com.google.common.io.Files$1.getInput(Files.java:97)
      at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
      at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
      at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
      at com.google.common.io.Files.readLines(Files.java:580)
      at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
      at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
      at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
      at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
      at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
      at org.apache.whirr.cli.Main.run(Main.java:64)
      at org.apache.whirr.cli.Main.main(Main.java:97)
      Starting to run scripts on cluster for phase destroyinstances:
      Starting to run scripts on cluster for phase destroyinstances:
      Finished running destroy phase scripts on all cluster instances
      Destroying hbase cluster
      Cluster hbase destroyed
      Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
      at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
      at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
      at org.apache.whirr.cli.Main.run(Main.java:64)
      at org.apache.whirr.cli.Main.main(Main.java:97)
      Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
      at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
      at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
      at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
      at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
      at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
      at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
      at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
      at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
      at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
      ... 3 more
      Caused by: java.net.ConnectException: Connection refused
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
      at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
      at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
      at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
      at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
      at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
      at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
      ... 11 more
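
      The innermost cause in the trace above is a refused TCP connection inside org.xbill.DNS.TCPClient.connect, so one way to narrow this down might be to check whether the launching machine can open a TCP connection to its DNS server at all. The following is only a diagnostic sketch using the JDK, not part of Whirr: the class name TcpDnsProbe, the method canConnect, and the 3-second timeout are hypothetical, and the DNS server address must be supplied by whoever runs it.

      ```java
      import java.io.IOException;
      import java.net.InetSocketAddress;
      import java.net.Socket;

      // Hypothetical diagnostic helper; not part of Whirr or dnsjava.
      final class TcpDnsProbe {

          // Returns true if a TCP connection to host:port can be opened within
          // timeoutMillis, false on any I/O failure (refused, timed out, unreachable).
          static boolean canConnect(String host, int port, int timeoutMillis) {
              try (Socket socket = new Socket()) {
                  socket.connect(new InetSocketAddress(host, port), timeoutMillis);
                  return true;
              } catch (IOException e) {
                  return false;
              }
          }

          public static void main(String[] args) {
              if (args.length != 1) {
                  System.err.println("usage: java TcpDnsProbe <dns-server-address>");
                  return;
              }
              // Port 53 is the standard DNS port; the timeout is an arbitrary choice.
              boolean reachable = canConnect(args[0], 53, 3000);
              System.out.println("TCP port 53 on " + args[0] + " reachable: " + reachable);
          }
      }
      ```

      If this reports false from behind the NAT but true from an EC2 instance, that would be consistent with the behaviour described above, although it does not by itself prove where the connection is being refused.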

      1. WHIRR-459.patch
        0.8 kB
        Paolo Castagna
      2. WHIRR-459-fallback-to-jclouds-hostname.patch
        6 kB
        Alex Heneveld
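
      The name of the second attachment suggests falling back to the hostname reported by jclouds when the reverse DNS lookup fails. The sketch below illustrates that general idea only; it is not the contents of either patch. ReverseDnsFallback, DnsLookupException and resolveOrFallback are hypothetical names, and the resolver is passed in as a plain function so the example does not depend on Whirr or dnsjava. The sample address and hostname are taken from the first node in the log above.

      ```java
      import java.util.function.Function;

      // Hypothetical illustration of "fall back to the provider hostname"; not Whirr code.
      final class ReverseDnsFallback {

          // Stand-in for a resolver failure such as the DnsException in the trace.
          static class DnsLookupException extends RuntimeException {
              DnsLookupException(String message, Throwable cause) {
                  super(message, cause);
              }
          }

          // Tries the reverse lookup first. If it throws or returns nothing usable,
          // returns the provider-reported hostname, or the raw IP as a last resort.
          static String resolveOrFallback(String ip,
                                          Function<String, String> resolver,
                                          String providerHostName) {
              try {
                  String name = resolver.apply(ip);
                  if (name != null && !name.isEmpty()) {
                      return name;
                  }
              } catch (DnsLookupException e) {
                  System.err.println("Reverse lookup failed for " + ip + ": " + e.getMessage());
              }
              return (providerHostName != null && !providerHostName.isEmpty())
                      ? providerHostName
                      : ip;
          }

          public static void main(String[] args) {
              // Simulates the refused connection reported in this issue.
              Function<String, String> refusing = ip -> {
                  throw new DnsLookupException("Connection refused", null);
              };
              // Falls back to the hostname the provider reported for the node.
              System.out.println(
                      resolveOrFallback("204.236.208.250", refusing, "domU-12-31-39-0F-94-D1"));
          }
      }
      ```

      With a fallback of this kind, a failed lookup degrades to a less friendly name instead of aborting the launch and tearing down the nodes.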

            People

            • Assignee:
              Alex Heneveld
              Reporter:
              Akash Ashok
            • Votes:
              1
              Watchers:
              2