Ignite / IGNITE-13705

Another node fails with failure of target node.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.1
    • None
    • Release Notes Required

    Description

      The discovery ducktape test [1] detected an unexpected failure of another node.

      Scenario:
      Consider nodes at consecutive positions in the ring: N, N+1, and N+2. Node N detects the failure of node N+1 and tries to connect to node N+2. Node N+2, in turn, checks the backward connection to node N+1.

      Problem:
      Node N can fail as well, although only node N+1 was expected to fail.

      Cause:
      The timeout for node N to recover its connection through node N+2 turns out to be shorter than the timeout for node N+2 to check the connection to node N+1, so node N can give up before node N+2 finishes its check.

      Fix:
      Introduced a base timeout value for checking and recovering connections, derived from the current configuration. Both timeouts mentioned above are now defined relative to this base value. The backward connection check timeout is now generally shorter than the connection recovery timeout.
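
The relationship between the two timeouts can be illustrated with a minimal Python sketch. It is not the Ignite implementation: the names `Timeouts`, `derive_timeouts`, and `node_n_outlasts_check`, as well as both factors, are hypothetical and chosen only for illustration, since the issue does not state the actual formulas or ratios.

```python
from dataclasses import dataclass

# Hypothetical factors (assumption): the issue only says that both timeouts
# are relative to a base value and that the backward check is shorter.
RECOVERY_FACTOR = 1.0
BACKWARD_CHECK_FACTOR = 0.5


@dataclass(frozen=True)
class Timeouts:
    recovery_ms: int        # how long node N waits while recovering via N+2
    backward_check_ms: int  # how long node N+2 spends checking node N+1


def derive_timeouts(base_timeout_ms: int) -> Timeouts:
    """Derive both timeouts from a single base value."""
    if base_timeout_ms <= 0:
        raise ValueError("base timeout must be positive")
    return Timeouts(
        recovery_ms=int(base_timeout_ms * RECOVERY_FACTOR),
        backward_check_ms=int(base_timeout_ms * BACKWARD_CHECK_FACTOR),
    )


def node_n_outlasts_check(timeouts: Timeouts) -> bool:
    """True if node N is still waiting when N+2 finishes its slowest check."""
    return timeouts.recovery_ms > timeouts.backward_check_ms


# After the fix: both values come from one base, so the ordering holds.
fixed = derive_timeouts(10_000)

# Before the fix (illustrative numbers): independent values could invert.
broken = Timeouts(recovery_ms=3_000, backward_check_ms=5_000)
```

With these illustrative numbers, `node_n_outlasts_check(fixed)` holds, while `node_n_outlasts_check(broken)` does not, which corresponds to the case where node N fails too.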

      [1] https://github.com/apache/ignite/blob/ignite-ducktape/modules/ducktests/tests/ignitetest/tests/discovery_test.py

            People

              Assignee: Vladimir Steshin (vladsz83)
              Reporter: Vladimir Steshin (vladsz83)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 10m