IMPALA-7594

TestAutomaticCatalogInvalidation.test_local_catalog and test_v1_catalog still hitting timeout


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: Impala 3.1.0
    • Fix Version/s: None
    • Component/s: Infrastructure

    Description

      Similar to IMPALA-7580, I hit this on a build of one of my patches when running under ASAN. The branch had the fix for IMPALA-7580 in it (it's based off 038af345933fde4fbcc9bc524f4ca93bfc08c633):

      custom_cluster.test_automatic_invalidation.TestAutomaticCatalogInvalidation.test_local_catalog (from pytest)
      
      Failing for the past 2 builds (Since Failed#3242 )
      Took 30 sec.
      Error Message
      assert 1537339209.989275 < 1537339209.65928  +  where 1537339209.989275 = <built-in function time>()  +    where <built-in function time> = time.time
      Stacktrace
      custom_cluster/test_automatic_invalidation.py:70: in test_local_catalog
          self._run_test(cursor)
      custom_cluster/test_automatic_invalidation.py:58: in _run_test
          assert time.time() < timeout
      E   assert 1537339209.989275 < 1537339209.65928
      E    +  where 1537339209.989275 = <built-in function time>()
      E    +    where <built-in function time> = time.time
      Standard Error
      -- 2018-09-18 23:39:39,498 INFO     MainThread: Starting cluster with command: /data/jenkins/workspace/impala-private-parameterized/repos/Impala/bin/start-impala-cluster.py --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad_args="--invalidate_tables_timeout_s=20 --use_local_catalog" ' '--state_store_args="--statestore_update_frequency_ms=50     --statestore_priority_update_frequency_ms=50     --statestore_heartbeat_frequency_ms=50" ' '--catalogd_args="--invalidate_tables_timeout_s=20 --catalog_topic_mode=minimal" '
      23:39:40 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/statestored.INFO
      23:39:41 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
      23:39:42 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad.INFO
      23:39:43 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
      23:39:44 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
      23:39:47 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      23:39:47 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000
      23:39:47 MainThread: Waiting for num_known_live_backends=3. Current value: 2
      23:39:48 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000
      23:39:48 MainThread: num_known_live_backends has reached value: 3
      23:39:48 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25001
      23:39:48 MainThread: num_known_live_backends has reached value: 3
      23:39:48 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25002
      23:39:48 MainThread: num_known_live_backends has reached value: 3
      23:39:48 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors).
      -- 2018-09-18 23:39:48,471 INFO     MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      -- 2018-09-18 23:39:48,471 INFO     MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25010
      -- 2018-09-18 23:39:48,472 INFO     MainThread: Metric 'statestore.live-backends' has reached desired value: 4
      -- 2018-09-18 23:39:48,472 INFO     MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000
      -- 2018-09-18 23:39:48,473 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2018-09-18 23:39:48,473 INFO     MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25001
      -- 2018-09-18 23:39:48,474 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2018-09-18 23:39:48,474 INFO     MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25002
      -- 2018-09-18 23:39:48,475 INFO     MainThread: num_known_live_backends has reached value: 3
      -- connecting to: localhost:21000
      -- 2018-09-18 23:39:48,475 ERROR    MainThread: Failed to open transport (tries_left=3)
      Traceback (most recent call last):
        File "/data/jenkins/workspace/impala-private-parameterized/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py", line 940, in _execute
          return func(request)
        File "/data/jenkins/workspace/impala-private-parameterized/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 175, in OpenSession
          return self.recv_OpenSession()
        File "/data/jenkins/workspace/impala-private-parameterized/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.py", line 186, in recv_OpenSession
          (fname, mtype, rseqid) = self._iprot.readMessageBegin()
        File "/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin
          sz = self.readI32()
        File "/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 206, in readI32
          buff = self.trans.readAll(4)
        File "/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll
          chunk = self.read(sz - have)
        File "/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TTransport.py", line 159, in read
          self.__rbuf = StringIO(self.__trans.read(max(sz, self.__rbuf_size)))
        File "/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSocket.py", line 120, in read
          message='TSocket read 0 bytes')
      TTransportException: TSocket read 0 bytes
      

      test_v1_catalog is also hitting a similar failure:

      custom_cluster.test_automatic_invalidation.TestAutomaticCatalogInvalidation.test_v1_catalog (from pytest)
      
      Failing for the past 2 builds (Since Failed#3242 )
      Took 31 sec.
      Error Message
      assert 1537339179.482299 < 1537339179.144369  +  where 1537339179.482299 = <built-in function time>()  +    where <built-in function time> = time.time
      Stacktrace
      custom_cluster/test_automatic_invalidation.py:63: in test_v1_catalog
          self._run_test(cursor)
      custom_cluster/test_automatic_invalidation.py:58: in _run_test
          assert time.time() < timeout
      E   assert 1537339179.482299 < 1537339179.144369
      E    +  where 1537339179.482299 = <built-in function time>()
      E    +    where <built-in function time> = time.time
      Standard Error
      -- 2018-09-18 23:39:07,901 INFO     MainThread: Starting cluster with command: /data/jenkins/workspace/impala-private-parameterized/repos/Impala/bin/start-impala-cluster.py --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad_args="--invalidate_tables_timeout_s=20" ' '--state_store_args="--statestore_update_frequency_ms=50     --statestore_priority_update_frequency_ms=50     --statestore_heartbeat_frequency_ms=50" ' '--catalogd_args="--invalidate_tables_timeout_s=20" '
      23:39:08 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/statestored.INFO
      23:39:09 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
      23:39:10 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad.INFO
      23:39:11 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
      23:39:12 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
      23:39:15 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      23:39:15 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000
      23:39:15 MainThread: Waiting for num_known_live_backends=3. Current value: 1
      23:39:16 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000
      23:39:16 MainThread: Waiting for num_known_live_backends=3. Current value: 2
      23:39:17 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000
      23:39:17 MainThread: num_known_live_backends has reached value: 3
      23:39:17 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25001
      23:39:17 MainThread: num_known_live_backends has reached value: 3
      23:39:17 MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25002
      23:39:17 MainThread: num_known_live_backends has reached value: 3
      23:39:17 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors).
      -- 2018-09-18 23:39:18,000 INFO     MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      -- 2018-09-18 23:39:18,001 INFO     MainThread: Getting metric: statestore.live-backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25010
      -- 2018-09-18 23:39:18,002 INFO     MainThread: Metric 'statestore.live-backends' has reached desired value: 4
      -- 2018-09-18 23:39:18,002 INFO     MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000
      -- 2018-09-18 23:39:18,003 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2018-09-18 23:39:18,003 INFO     MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25001
      -- 2018-09-18 23:39:18,003 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2018-09-18 23:39:18,003 INFO     MainThread: Getting num_known_live_backends from impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25002
      -- 2018-09-18 23:39:18,004 INFO     MainThread: num_known_live_backends has reached value: 3
      -- connecting to: localhost:21000
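
      For context, the failing assertion (`assert time.time() < timeout` in `_run_test`) is the deadline check of a poll-until-timeout loop: the test repeatedly checks for the expected catalog invalidation and gives up once the wall clock passes a precomputed deadline. A minimal standalone sketch of that pattern (function name and parameters are illustrative, not the actual test code):

      ```python
      import time

      def poll_until(condition, timeout_s=20.0, interval_s=0.05):
          """Poll `condition` until it returns True or the deadline passes."""
          deadline = time.time() + timeout_s
          while not condition():
              # This is the check that fires in the reports above: once the
              # current time exceeds the deadline, the assert fails with
              # "assert <now> < <deadline>".
              assert time.time() < deadline, "timed out waiting for condition"
              time.sleep(interval_s)

      # Usage: returns immediately when the condition already holds.
      poll_until(lambda: True, timeout_s=1.0)
      ```

      Under load (e.g. an ASAN build, as here), the observed effect is the condition simply not becoming true before the fixed deadline, so both tests fail on the same assertion.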
      

      People

        Tianyi Wang (tianyiwang)
        Tim Armstrong (tarmstrong)
        Votes: 0
        Watchers: 3
