Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
-
None
-
ghx-label-2
Description
After IMPALA-12737 "Refactor the Workload Management Initialization Process", custom_cluster/test_query_log.py::TestQueryLogTableHS2::test_flush_on_interval failed with
-- 2024-09-03 11:50:35,147 INFO MainThread: Starting cluster with command: /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 --log_dir=/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad=--shutdown_grace_period_s=0 --shutdown_deadline_s=15' '--impalad_args=--enable_workload_mgmt --query_log_write_interval_s=15 ' '--state_store_args=None ' '--catalogd_args=--enable_workload_mgmt ' --impalad_args=--default_query_options= 11:50:35 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es) 11:50:35 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/statestored.INFO 11:50:35 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/catalogd.INFO 11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.INFO 11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO 11:50:35 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO 11:50:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 11:50:38 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 11:50:38 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 11:50:38 MainThread: Debug webpage not yet available: HTTPConnectionPool(host='impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com', port=25000): Max retries exceeded with url: /backends?json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99f9704d10>: Failed to establish a new connection: [Errno 111] Connection refused',)) 11:50:40 MainThread: Debug webpage did not become available in expected time. 11:50:40 MainThread: Waiting for num_known_live_backends=3. Current value: None 11:50:41 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 11:50:41 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 11:50:41 MainThread: Debug webpage not yet available: HTTPConnectionPool(host='impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com', port=25000): Max retries exceeded with url: /backends?json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f99f97037d0>: Failed to establish a new connection: [Errno 111] Connection refused',)) 11:50:43 MainThread: Debug webpage did not become available in expected time. 11:50:43 MainThread: Waiting for num_known_live_backends=3. Current value: None 11:50:44 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 11:50:44 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 11:50:44 MainThread: num_known_live_backends has reached value: 3 11:50:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 11:50:45 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25001 11:50:45 MainThread: num_known_live_backends has reached value: 3 11:50:45 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) 11:50:45 MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25002 11:50:45 MainThread: num_known_live_backends has reached value: 3 11:50:46 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 executors). -- 2024-09-03 11:50:46,553 DEBUG MainThread: Found 3 impalad/1 statestored/1 catalogd process(es) -- 2024-09-03 11:50:46,553 INFO MainThread: Getting metric: statestore.live-backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25010 -- 2024-09-03 11:50:46,556 INFO MainThread: Metric 'statestore.live-backends' has reached desired value: 4 -- 2024-09-03 11:50:46,556 DEBUG MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:50:46,558 INFO MainThread: num_known_live_backends has reached value: 3 -- 2024-09-03 11:50:46,558 DEBUG MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25001 -- 2024-09-03 11:50:46,559 INFO MainThread: num_known_live_backends has reached value: 3 -- 2024-09-03 11:50:46,559 DEBUG MainThread: Getting num_known_live_backends from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25002 -- 2024-09-03 11:50:46,561 INFO MainThread: num_known_live_backends has reached value: 3 SET client_identifier=custom_cluster/test_query_log.py::TestQueryLogTableHS2::()::test_flush_on_interval[protocol:hs2|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':5000;'disable_codegen':False;'abort_on_error':1;'exec_single_node_ro; -- connecting to: localhost:21000 -- 2024-09-03 11:50:46,561 INFO MainThread: Could not connect to ('::1', 21000, 0, 0) Traceback (most recent call last): File "/data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 137, in open handle.connect(sockaddr) File "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth return getattr(self._sock,name)(*args) error: [Errno 111] Connection refused -- connecting to localhost:21050 with impyla -- 2024-09-03 11:50:46,562 INFO MainThread: Could not connect to ('::1', 21050, 0, 0) Traceback (most recent call last): File "/data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 137, in open handle.connect(sockaddr) File "/data/jenkins/workspace/impala-asf-master-core-s3/Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/socket.py", line 228, in meth return getattr(self._sock,name)(*args) error: [Errno 111] Connection refused -- 2024-09-03 11:50:46,638 INFO MainThread: Closing active operation -- connecting to localhost:28000 with impyla -- 2024-09-03 11:50:46,661 INFO MainThread: Closing active operation -- 2024-09-03 11:50:46,663 INFO MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: } -- 2024-09-03 11:50:47,665 INFO MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: I0903 11:50:46.911687 14130 workload-man -- 2024-09-03 11:50:48,668 INFO MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: I0903 11:50:46.911687 14130 workload-man -- 2024-09-03 11:50:49,670 INFO MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: I0903 11:50:46.911687 14130 workload-man -- 2024-09-03 11:50:50,673 INFO MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com.jenkins.log.INFO.20240903-115035.12755 matching regex 'Completed workload management initialization', but found 0 lines. Last line was: I0903 11:50:46.911687 14130 workload-man -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074f10> -- 2024-09-03 11:50:52,884 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074810> -- 2024-09-03 11:50:53,898 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- 2024-09-03 11:50:54,913 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- 2024-09-03 11:50:55,927 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0> -- 2024-09-03 11:50:56,941 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- 2024-09-03 11:50:57,956 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- 2024-09-03 11:50:58,970 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074dd0> -- 2024-09-03 11:50:59,984 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f81500746d0> -- 2024-09-03 11:51:00,998 INFO MainThread: Closing active operation -- getting log for operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- getting runtime profile operation: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- closing query for operation handle: <tests.common.impala_connection.OperationHandle object at 0x7f8150074210> -- 2024-09-03 11:51:02,011 INFO MainThread: Closing active operation -- 2024-09-03 11:51:02,012 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:02,015 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 0 -- 2024-09-03 11:51:02,015 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:03,016 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:03,029 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 0 -- 2024-09-03 11:51:03,029 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:04,030 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:04,033 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:04,033 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:05,034 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:05,037 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:05,037 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:06,038 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:06,041 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:06,041 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:07,042 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:07,045 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:07,045 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:08,046 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:08,049 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:08,049 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:09,050 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:09,054 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:09,054 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:10,055 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:10,058 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:10,058 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:11,059 INFO MainThread: Getting metric: impala-server.completed-queries.written from impala-ec2-centos79-m6i-4xlarge-xldisk-0f51.vpc.cloudera.com:25000 -- 2024-09-03 11:51:11,062 INFO MainThread: Waiting for metric value 'impala-server.completed-queries.written'=10. Current value: 9 -- 2024-09-03 11:51:11,062 INFO MainThread: Sleeping 1s before next retry. -- 2024-09-03 11:51:12,063 INFO MainThread: Metric impala-server.completed-queries.written did not reach value 10 in 10s. Failing...
Logs attached.
Attachments
Attachments
Issue Links
- relates to
-
IMPALA-13347 TSAN failure in backend tests after IMPALA-12737
- Resolved