Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 4.4.0
-
None
-
ghx-label-8
Description
TestStatestoredHA.test_statestored_auto_failover_with_disabling_network failed with following stack trace when repeatedly run this test.
tests/custom_cluster/test_statestored_ha.py:645: in test_statestored_auto_failover_with_disabling_network
"statestore.in-ha-recovery-mode", expected_value=False, timeout=120)
tests/common/impala_service.py:144: in wait_for_metric_value
self.__metric_timeout_assert(metric_name, expected_value, timeout)
tests/common/impala_service.py:213: in __metric_timeout_assert
assert 0, assert_string
E AssertionError: Metric statestore.in-ha-recovery-mode did not reach value False in 120s.
From log messages, the issue was caused by the delay of HA Handshake RPC between two statestore instances. Sometimes the active statestore took a few minutes to response the handshake requests from standby statestore.
This issue is different from IMPALA-12525, which was caused locking issue on subscribers side.
Attachments
Issue Links
- is caused by
-
IMPALA-12156 Support Impala Statestore HA
- Resolved