  1. Ignite
  2. IGNITE-13014

Remove double checking of node availability.

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Invalid
    • Release Notes Required

    Description

      Proposal:
      Do not check a failed node a second time. Double checking prolongs node failure detection and gives no additional benefit. The routine is also messy and contains hardcoded values.

      Currently, node availability is checked twice. Suppose node 2 stops responding. Node 1 becomes unable to ping node 2 and asks node 3 to establish a permanent connection in place of node 2. Node 3 may then check node 2 as well, or it may not.

      As a result, node failure detection can take as long as ServerImpl.CON_CHECK_INTERVAL + 2 * IgniteConfiguration.failureDetectionTimeout + 300ms.
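
      To give a feel for the size of this bound, here is a minimal sketch that evaluates the formula above. The class and method names are hypothetical, and the input values (500 ms for the connection check interval, 10 000 ms for the failure detection timeout) are assumptions chosen for illustration; check them against the Ignite version and configuration in use.

```java
public class WorstCaseDetection {
    /** Fixed extra delay taken from the formula in the description, in milliseconds. */
    static final long FIXED_EXTRA_MS = 300;

    /**
     * Evaluates CON_CHECK_INTERVAL + 2 * failureDetectionTimeout + 300ms.
     * The factor 2 reflects the double check: the failed node may be
     * probed twice before it is declared failed.
     */
    static long worstCaseMillis(long conCheckIntervalMs, long failureDetectionTimeoutMs) {
        return conCheckIntervalMs + 2 * failureDetectionTimeoutMs + FIXED_EXTRA_MS;
    }

    public static void main(String[] args) {
        long conCheckIntervalMs = 500;           // assumed value, not taken from the issue
        long failureDetectionTimeoutMs = 10_000; // assumed value, not taken from the issue

        System.out.println(worstCaseMillis(conCheckIntervalMs, failureDetectionTimeoutMs) + " ms");
    }
}
```

      With these assumed inputs the bound is roughly twice the configured failure detection timeout, which is the delay the proposal aims to avoid.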

      See:

      • 'NodeFailureResearch.patch'. It adds the test 'FailureDetectionResearch', which emulates slow responses from a failed node and measures failure detection delays.
      • 'FailureDetectionResearch.txt' - results of the test.
      • 'WostCaseStepByStep.txt' - a description of how the worst case happens.

      Attachments

        1. FailureDetectionResearch.txt
          0.3 kB
          Vladimir Steshin
        2. NodeFailureResearch.patch
          155 kB
          Vladimir Steshin
        3. WostCaseStepByStep.txt
          5 kB
          Vladimir Steshin

            People

              vladsz83 Vladimir Steshin
              vladsz83 Vladimir Steshin
              Votes: 1
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m