Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-13350

Workload Management flush on interval test failed

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • Impala 4.5.0
    • Backend
    • None
    • ghx-label-2

    Description

      After IMPALA-12737 "Refactor the Workload Management Initialization Process", custom_cluster/test_query_log.py::TestQueryLogTableHS2::test_flush_on_interval failed with

      -- 2024-09-03 11:50:35,147 INFO     MainThread: Starting cluster with command: /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50     --statestore_priority_update_frequency_ms=50     --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad=--shutdown_grace_period_s=0 --shutdown_deadline_s=15' '--impalad_args=--enable_workload_mgmt --query_log_write_interval_s=15 ' '--state_store_args=None ' '--catalogd_args=--enable_workload_mgmt ' --impalad_args=--default_query_options=
      11:50:35 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
      11:50:35 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/statestored.INFO
      11:50:35 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
      11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.INFO
      11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
      11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
      11:50:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:38 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      11:50:38 MainThread: Debug webpage not yet available: HTTPConnectionPool(host='impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com', port=25000): Max retries exceeded with url: /backends?json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99f9704d10>: Failed to establish a new connection: [Errno 111] Connection refused',))
      11:50:40 MainThread: Debug webpage did not become available in expected time.
      11:50:40 MainThread: Waiting for num_known_live_backends=3. Current value: None
      11:50:41 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:41 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      11:50:41 MainThread: Debug webpage not yet available: HTTPConnectionPool(host='impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com', port=25000): Max retries exceeded with url: /backends?json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99f97037d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
      11:50:43 MainThread: Debug webpage did not become available in expected time.
      11:50:43 MainThread: Waiting for num_known_live_backends=3. Current value: None
      11:50:44 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:44 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      11:50:44 MainThread: num_known_live_backends has reached value: 3
      11:50:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:45 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25001
      11:50:45 MainThread: num_known_live_backends has reached value: 3
      11:50:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:45 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25002
      11:50:45 MainThread: num_known_live_backends has reached value: 3
      11:50:46 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors).
      -- 2024-09-03 11:50:46,553 DEBUG    MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      -- 2024-09-03 11:50:46,553 INFO     MainThread: Getting metric: statestore.live-backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25010
      -- 2024-09-03 11:50:46,556 INFO     MainThread: Metric 'statestore.live-backends' has reached desired value: 4
      -- 2024-09-03 11:50:46,556 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:50:46,558 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2024-09-03 11:50:46,558 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25001
      -- 2024-09-03 11:50:46,559 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2024-09-03 11:50:46,559 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25002
      -- 2024-09-03 11:50:46,561 INFO     MainThread: num_known_live_backends has reached value: 3
      SET client_identifier=custom_cluster/test_query_log.py::TestQueryLogTableHS2::()::test_flush_on_interval[protocol:hs2|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_ro;
      -- connecting to: localhost:21000
      -- 2024-09-03 11:50:46,561 INFO     MainThread: Could not connect to ('::1', 21000, 0, 0)
      Traceback (most recent call last):
        File "/data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 137, in open
          handle.connect(sockaddr)
        File "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth
          return getattr(self._sock,name)(*args)
      error: [Errno 111] Connection refused
      -- connecting to localhost:21050 with impyla
      -- 2024-09-03 11:50:46,562 INFO     MainThread: Could not connect to ('::1', 21050, 0, 0)
      Traceback (most recent call last):
        File "/data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 137, in open
          handle.connect(sockaddr)
        File "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth
          return getattr(self._sock,name)(*args)
      error: [Errno 111] Connection refused
      -- 2024-09-03 11:50:46,638 INFO     MainThread: Closing active operation
      -- connecting to localhost:28000 with impyla
      -- 2024-09-03 11:50:46,661 INFO     MainThread: Closing active operation
      -- 2024-09-03 11:50:46,663 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
                  }
      -- 2024-09-03 11:50:47,665 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- 2024-09-03 11:50:48,668 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- 2024-09-03 11:50:49,670 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- 2024-09-03 11:50:50,673 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10>
      -- 2024-09-03 11:50:52,884 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810>
      -- 2024-09-03 11:50:53,898 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- 2024-09-03 11:50:54,913 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- 2024-09-03 11:50:55,927 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- 2024-09-03 11:50:56,941 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- 2024-09-03 11:50:57,956 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- 2024-09-03 11:50:58,970 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- 2024-09-03 11:50:59,984 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- 2024-09-03 11:51:00,998 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- 2024-09-03 11:51:02,011 INFO     MainThread: Closing active operation
      -- 2024-09-03 11:51:02,012 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:02,015 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 0
      -- 2024-09-03 11:51:02,015 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:03,016 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:03,029 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 0
      -- 2024-09-03 11:51:03,029 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:04,030 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:04,033 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:04,033 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:05,034 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:05,037 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:05,037 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:06,038 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:06,041 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:06,041 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:07,042 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:07,045 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:07,045 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:08,046 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:08,049 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:08,049 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:09,050 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:09,054 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:09,054 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:10,055 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:10,058 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:10,058 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:11,059 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:11,062 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:11,062 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:12,063 INFO     MainThread: Metric impala-server.completed-queries.written did not reach value 10 in 10s. Failing...

      Logs attached.

      Attachments

        1. failure.tgz
          244 kB
          Michael Smith

        Issue Links

          Activity

            People

              jasonmfehr Jason Fehr
              MikaelSmith Michael Smith
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: