  1. REEF (Retired)
  2. REEF-1679

Evaluator shouldn't go to recovery mode if there is no reconnect logic provided


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • 0.16
    • None

    Description

      Current behavior of the .NET Evaluator is as follows: if the evaluator fails to send a heartbeat to the driver three times in a row (which takes about 8 seconds), it considers the driver dead or unreachable and enters recovery mode. However, if the code doesn't provide logic for handling reconnects, IDriverConnection uses its default implementation, MissingDriverConnection, which promptly throws NotImplementedException. The evaluator keeps trying to send heartbeats, which (already in recovery mode) keep throwing the exception, so the evaluator loses any chance to reconnect to the driver and hangs indefinitely.
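
      The failure mode described here can be modeled in a few lines. This is only an illustrative Python sketch, not the REEF .NET code: the class mirrors the name MissingDriverConnection, but the method name get_driver_information, the state names, and the simulate helper are assumptions made for the example.

```python
class MissingDriverConnection:
    """Stand-in for the default binding: it has no reconnect logic."""

    def get_driver_information(self):  # hypothetical method name
        raise NotImplementedError("no reconnect logic was provided")


MAX_FAILURES = 3  # heartbeat failures in a row before entering recovery mode


def simulate(ticks, connection):
    """Run `ticks` heartbeat attempts against a driver that is unreachable."""
    failures, state, log = 0, "RUNNING", []
    for _ in range(ticks):
        if state == "RUNNING":
            failures += 1  # the heartbeat send fails
            if failures >= MAX_FAILURES:
                state = "RECOVERY"
        else:
            try:
                # In recovery mode the evaluator asks the connection
                # where the driver is now.
                connection.get_driver_information()
                state = "RUNNING"
            except NotImplementedError:
                pass  # the exception repeats on every tick; state never changes
        log.append(state)
    return log


print(simulate(6, MissingDriverConnection()))
```

      In this model the evaluator reaches RECOVERY on the third failed heartbeat and stays there no matter how many more ticks are simulated, which corresponds to the indefinite hang.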

      We should fix this by checking whether a non-default implementation is bound for IDriverConnection. If there is one, we should enter recovery mode as before. If there is none, there is no point in entering recovery mode; instead we should keep trying to reach the driver for a while longer and then fail the evaluator to avoid wasting resources.
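
      The proposed decision could look roughly like the following. Again, this is a Python sketch under stated assumptions rather than the actual fix: CustomDriverConnection, next_action, the returned action names, and the threshold of 10 attempts are all hypothetical. The issue only says to "try some more" and does not give a number.

```python
class MissingDriverConnection:
    """Stand-in for the default binding (no reconnect logic)."""


class CustomDriverConnection:
    """Hypothetical user-provided binding that knows how to reconnect."""


FAILURES_BEFORE_RECOVERY = 3  # from the issue: three failed heartbeats in a row
FAILURES_BEFORE_GIVING_UP = 10  # assumed value, not specified in the issue


def next_action(consecutive_failures, connection):
    """Decide what the evaluator does after a failed heartbeat."""
    has_reconnect_logic = not isinstance(connection, MissingDriverConnection)
    if has_reconnect_logic:
        # Unchanged behavior: recovery mode can actually find the driver.
        if consecutive_failures >= FAILURES_BEFORE_RECOVERY:
            return "ENTER_RECOVERY"
        return "RETRY"
    # No reconnect logic: recovery would only hang, so keep retrying
    # a little longer and then fail the evaluator to free its resources.
    if consecutive_failures >= FAILURES_BEFORE_GIVING_UP:
        return "FAIL_EVALUATOR"
    return "RETRY"


print(next_action(3, CustomDriverConnection()))
print(next_action(3, MissingDriverConnection()))
print(next_action(10, MissingDriverConnection()))
```

      The type check stands in for "is a non-default implementation bound"; in a dependency-injection setting the same question would be answered by inspecting the binding rather than the instance.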

          People

            juliaw Julia Wang
            MariiaMykhailova Mariia Mykhailova
