  1. HBase
  2. HBASE-19147 All branch-2 unit tests pass
  3. HBASE-19220

Async tests time out talking to zk; 'clusterid came back null'

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-beta-1, 2.0.0
    • Component/s: test
    • Labels: None
    • Release Note: Changed retries from 3 to 30 for zk initial connect for registry.

    Description

      I see this in test runs on a dedicated machine:

      [ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 652.514 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncAdminBuilder
      [ERROR] testRpcTimeout[0](org.apache.hadoop.hbase.client.TestAsyncAdminBuilder) Time elapsed: 213.618 s <<< ERROR!
      java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
      at org.apache.hadoop.hbase.client.TestAsyncAdminBuilder.testRpcTimeout(TestAsyncAdminBuilder.java:105)
      Caused by: java.io.IOException: clusterid came back null

      [ERROR] org.apache.hadoop.hbase.client.TestAsyncTableScanMetrics Time elapsed: 0.007 s <<< ERROR!
      java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
      at org.apache.hadoop.hbase.client.TestAsyncTableScanMetrics.setUp(TestAsyncTableScanMetrics.java:97)
      Caused by: java.io.IOException: clusterid came back null

      [ERROR] org.apache.hadoop.hbase.client.TestRawAsyncScanCursor Time elapsed: 0.005 s <<< ERROR!
      java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
      at org.apache.hadoop.hbase.client.TestRawAsyncScanCursor.setUpBeforeClass(TestRawAsyncScanCursor.java:42)
      Caused by: java.io.IOException: clusterid came back null

      [ERROR] org.apache.hadoop.hbase.client.TestAsyncNamespaceAdminApi Time elapsed: 0.005 s <<< ERROR!
      java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
      at org.apache.hadoop.hbase.client.TestAsyncNamespaceAdminApi.setUpBeforeClass(TestAsyncNamespaceAdminApi.java:66)
      Caused by: java.io.IOException: clusterid came back null

      If I raise the retry count, the failures go away.

      On this machine at least, zk connections can take a while to come up; see HBASE-19102, where we add a wait on the Connection to come up before progressing.

      I suggest raising the retry count; there is no harm in trying more. It is currently set to 3 retries at a one-second interval.
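
To illustrate why a larger retry budget helps, here is a minimal sketch of a bounded retry loop with a fixed sleep between attempts. It is not the HBase registry code: the class `ClusterIdRetrySketch`, the method `fetchWithRetries`, and the `Supplier` standing in for the zk read are all hypothetical names chosen for this example. Only the error message and the 3 versus 30 counts come from this issue.

```java
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical illustration only; not taken from the HBase source.
final class ClusterIdRetrySketch {

  /**
   * Calls fetch up to maxAttempts times, sleeping intervalMs between attempts,
   * and returns the first non-null result. Fails with the same message seen in
   * the test logs if every attempt returns null.
   */
  static String fetchWithRetries(Supplier<String> fetch, int maxAttempts, long intervalMs)
      throws IOException, InterruptedException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      String clusterId = fetch.get();
      if (clusterId != null) {
        return clusterId;
      }
      if (attempt < maxAttempts) {
        Thread.sleep(intervalMs); // fixed interval between attempts
      }
    }
    throw new IOException("clusterid came back null");
  }

  public static void main(String[] args) throws Exception {
    // Assumed stand-in for a slow zk connection: answers only on the 5th call.
    int[] calls = {0};
    Supplier<String> slowZk = () -> ++calls[0] < 5 ? null : "example-cluster-id";

    try {
      fetchWithRetries(slowZk, 3, 10L); // small budget: gives up too early
    } catch (IOException e) {
      System.out.println("3 attempts: " + e.getMessage());
    }

    calls[0] = 0;
    // Larger budget: succeeds as soon as the connection is up.
    System.out.println("30 attempts: " + fetchWithRetries(slowZk, 30, 10L));
  }
}
```

Because the loop returns on the first non-null answer, a larger limit costs nothing when the connection comes up quickly; it only lengthens the wait in the case that would otherwise have failed. The 10 ms interval in `main` is shortened for the demo; the issue describes a one-second interval.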

      Attachments

        1. 19220.patch
          1.0 kB
          Michael Stack

          People

            Assignee: Michael Stack (stack)
            Reporter: Michael Stack (stack)