HBASE-12554: TestBaseLoadBalancer may time out due to lengthy rack lookup

    Details

    • Type: Test
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.99.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Here is one of the recent occurrences (https://builds.apache.org/job/PreCommit-HBASE-Build/11778/console):

      testImmediateAssignment(org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer)  Time elapsed: 30.019 sec  <<< ERROR!
      java.lang.Exception: test timed out after 30000 milliseconds
      	at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
      	at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
      	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
      	at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
      	at java.net.InetAddress.getAllByName(InetAddress.java:1162)
      	at java.net.InetAddress.getAllByName(InetAddress.java:1098)
      	at java.net.InetAddress.getByName(InetAddress.java:1048)
      	at org.apache.hadoop.net.NetUtils.normalizeHostName(NetUtils.java:561)
      	at org.apache.hadoop.net.NetUtils.normalizeHostNames(NetUtils.java:578)
      	at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:109)
      	at org.apache.hadoop.hbase.master.RackManager.getRack(RackManager.java:66)
      	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:273)
      	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:1113)
      	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.randomAssignment(BaseLoadBalancer.java:1175)
      	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.immediateAssignment(BaseLoadBalancer.java:1145)
      	at org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer.testImmediateAssignment(TestBaseLoadBalancer.java:136)
      

      One possible fix is to submit CachedDNSToSwitchMapping.resolve() to an executor pool. RackManager.getRack() can then apply a timeout and return UNKNOWN_RACK if the lookup does not finish in time.
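
      The proposal above could be sketched roughly as follows. This is not the attached patch: the class name TimedRackLookup, the Function-based resolver (a stand-in for CachedDNSToSwitchMapping.resolve()), the timeout parameter, and the UNKNOWN_RACK value used here are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper, not part of HBase: bounds a host-to-rack lookup with a timeout.
class TimedRackLookup {
    // Placeholder value; assumed for this sketch.
    static final String UNKNOWN_RACK = "Unknown Rack";

    private final ExecutorService pool;
    private final Function<String, String> resolver; // stand-in for the DNS-backed mapping
    private final long timeoutMs;

    TimedRackLookup(Function<String, String> resolver, long timeoutMs) {
        this.resolver = resolver;
        this.timeoutMs = timeoutMs;
        // Daemon threads so a stuck lookup cannot keep the JVM alive.
        this.pool = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "rack-lookup");
            t.setDaemon(true);
            return t;
        });
    }

    // Returns the resolved rack, or UNKNOWN_RACK if the lookup is slow, fails, or returns null.
    String getRack(String host) {
        Callable<String> task = () -> resolver.apply(host);
        Future<String> future = pool.submit(task);
        try {
            String rack = future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return rack != null ? rack : UNKNOWN_RACK;
        } catch (TimeoutException e) {
            future.cancel(true); // best effort; a blocked native lookup may not respond
            return UNKNOWN_RACK;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return UNKNOWN_RACK;
        } catch (ExecutionException e) {
            return UNKNOWN_RACK;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        // Simulated slow resolver; a real one would perform the DNS lookup.
        TimedRackLookup lookup = new TimedRackLookup(host -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "/default-rack";
        }, 100);
        System.out.println(lookup.getRack("host1.example.com"));
        lookup.shutdown();
    }
}
```

      One caveat with this approach: cancelling the Future only interrupts the worker thread, and a thread blocked inside a native name lookup may not react to the interrupt. The caller gets its answer on time, but the worker can linger until the lookup returns, which is why the sketch uses daemon threads.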

        Attachments

        1. 12554-v1.txt (3 kB, Ted Yu)
        2. 12554-v2.txt (5 kB, Ted Yu)
        3. 12554-v3.txt (7 kB, Ted Yu)
        4. 12554-v4.txt (9 kB, Ted Yu)
        5. 12554-v5.txt (2 kB, Ted Yu)

            People

            • Assignee: Ted Yu (yuzhihong@gmail.com)
            • Reporter: Ted Yu (yuzhihong@gmail.com)