Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-12550

test_statestored_auto_failover_with_disabling_network flaky

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 4.4.0
    • Impala 4.4.0
    • Backend
    • None
    • ghx-label-8

    Description

      TestStatestoredHA.test_statestored_auto_failover_with_disabling_network failed with following stack trace when repeatedly run this test.
      tests/custom_cluster/test_statestored_ha.py:645: in test_statestored_auto_failover_with_disabling_network
      "statestore.in-ha-recovery-mode", expected_value=False, timeout=120)
      tests/common/impala_service.py:144: in wait_for_metric_value
      self.__metric_timeout_assert(metric_name, expected_value, timeout)
      tests/common/impala_service.py:213: in __metric_timeout_assert
      assert 0, assert_string
      E AssertionError: Metric statestore.in-ha-recovery-mode did not reach value False in 120s.

      From log messages, the issue was caused by the delay of HA Handshake RPC between two statestore instances. Sometimes the active statestore took a few minutes to response the handshake requests from standby statestore.

      This issue is different from IMPALA-12525, which was caused locking issue on subscribers side.

      Attachments

        Issue Links

          Activity

            People

              wzhou Wenzhe Zhou
              wzhou Wenzhe Zhou
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: