Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-13350

Workload Management flush on interval test failed

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • Impala 4.5.0
    • Backend
    • None
    • ghx-label-2

    Description

      After IMPALA-12737 "Refactor the Workload Management Initialization Process", custom_cluster/test_query_log.py::TestQueryLogTableHS2::test_flush_on_interval failed with

      -- 2024-09-03 11:50:35,147 INFO     MainThread: Starting cluster with command: /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50     --statestore_priority_update_frequency_ms=50     --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad=--shutdown_grace_period_s=0 --shutdown_deadline_s=15' '--impalad_args=--enable_workload_mgmt --query_log_write_interval_s=15 ' '--state_store_args=None ' '--catalogd_args=--enable_workload_mgmt ' --impalad_args=--default_query_options=
      11:50:35 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
      11:50:35 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/statestored.INFO
      11:50:35 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
      11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.INFO
      11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
      11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
      11:50:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:38 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      11:50:38 MainThread: Debug webpage not yet available: HTTPConnectionPool(host='impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com', port=25000): Max retries exceeded with url: /backends?json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99f9704d10>: Failed to establish a new connection: [Errno 111] Connection refused',))
      11:50:40 MainThread: Debug webpage did not become available in expected time.
      11:50:40 MainThread: Waiting for num_known_live_backends=3. Current value: None
      11:50:41 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:41 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      11:50:41 MainThread: Debug webpage not yet available: HTTPConnectionPool(host='impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com', port=25000): Max retries exceeded with url: /backends?json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99f97037d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
      11:50:43 MainThread: Debug webpage did not become available in expected time.
      11:50:43 MainThread: Waiting for num_known_live_backends=3. Current value: None
      11:50:44 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:44 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      11:50:44 MainThread: num_known_live_backends has reached value: 3
      11:50:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:45 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25001
      11:50:45 MainThread: num_known_live_backends has reached value: 3
      11:50:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      11:50:45 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25002
      11:50:45 MainThread: num_known_live_backends has reached value: 3
      11:50:46 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors).
      -- 2024-09-03 11:50:46,553 DEBUG    MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      -- 2024-09-03 11:50:46,553 INFO     MainThread: Getting metric: statestore.live-backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25010
      -- 2024-09-03 11:50:46,556 INFO     MainThread: Metric 'statestore.live-backends' has reached desired value: 4
      -- 2024-09-03 11:50:46,556 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:50:46,558 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2024-09-03 11:50:46,558 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25001
      -- 2024-09-03 11:50:46,559 INFO     MainThread: num_known_live_backends has reached value: 3
      -- 2024-09-03 11:50:46,559 DEBUG    MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25002
      -- 2024-09-03 11:50:46,561 INFO     MainThread: num_known_live_backends has reached value: 3
      SET client_identifier=custom_cluster/test_query_log.py::TestQueryLogTableHS2::()::test_flush_on_interval[protocol:hs2|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_ro;
      -- connecting to: localhost:21000
      -- 2024-09-03 11:50:46,561 INFO     MainThread: Could not connect to ('::1', 21000, 0, 0)
      Traceback (most recent call last):
        File "/data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 137, in open
          handle.connect(sockaddr)
        File "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth
          return getattr(self._sock,name)(*args)
      error: [Errno 111] Connection refused
      -- connecting to localhost:21050 with impyla
      -- 2024-09-03 11:50:46,562 INFO     MainThread: Could not connect to ('::1', 21050, 0, 0)
      Traceback (most recent call last):
        File "/data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 137, in open
          handle.connect(sockaddr)
        File "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth
          return getattr(self._sock,name)(*args)
      error: [Errno 111] Connection refused
      -- 2024-09-03 11:50:46,638 INFO     MainThread: Closing active operation
      -- connecting to localhost:28000 with impyla
      -- 2024-09-03 11:50:46,661 INFO     MainThread: Closing active operation
      -- 2024-09-03 11:50:46,663 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
                  }
      -- 2024-09-03 11:50:47,665 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- 2024-09-03 11:50:48,668 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- 2024-09-03 11:50:49,670 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- 2024-09-03 11:50:50,673 INFO     MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: 
      I0903 11:50:46.911687 14130 workload-man
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10>
      -- 2024-09-03 11:50:52,884 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810>
      -- 2024-09-03 11:50:53,898 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- 2024-09-03 11:50:54,913 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- 2024-09-03 11:50:55,927 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- 2024-09-03 11:50:56,941 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- 2024-09-03 11:50:57,956 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- 2024-09-03 11:50:58,970 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0>
      -- 2024-09-03 11:50:59,984 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0>
      -- 2024-09-03 11:51:00,998 INFO     MainThread: Closing active operation
      -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210>
      -- 2024-09-03 11:51:02,011 INFO     MainThread: Closing active operation
      -- 2024-09-03 11:51:02,012 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:02,015 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 0
      -- 2024-09-03 11:51:02,015 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:03,016 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:03,029 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 0
      -- 2024-09-03 11:51:03,029 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:04,030 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:04,033 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:04,033 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:05,034 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:05,037 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:05,037 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:06,038 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:06,041 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:06,041 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:07,042 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:07,045 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:07,045 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:08,046 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:08,049 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:08,049 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:09,050 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:09,054 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:09,054 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:10,055 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:10,058 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:10,058 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:11,059 INFO     MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000
      -- 2024-09-03 11:51:11,062 INFO     MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9
      -- 2024-09-03 11:51:11,062 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-09-03 11:51:12,063 INFO     MainThread: Metric impala-server.completed-queries.written did not reach value 10 in 10s. Failing...

      Logs attached.

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            jasonmfehr Jason Fehr
            MikaelSmith Michael Smith
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment