TinkerPop / TINKERPOP-2569

Reconnect to server if Java driver fails to initialize


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.4.11
    • Fix Version/s: 3.6.0, 3.4.13, 3.5.2
    • Component/s: driver
    • Labels: None

    Description

      As reported here on SO: https://stackoverflow.com/questions/67586427/how-to-recover-with-a-retry-from-gremlin-nohostavailableexception

      If the host is unavailable when the Client is initialized, the host is never put into a state where reconnection is possible. Essentially, the following test for GremlinServerIntegrateTest should pass:

      @Test
      public void shouldFailOnInitiallyDeadHost() throws Exception {

          // start the test with no server running
          this.stopServer();

          final Cluster cluster = TestClientFactory.build().create();
          final Client client = cluster.connect();

          try {
              // try to issue a request while the server is down
              client.submit("g").all().get(3000, TimeUnit.MILLISECONDS);
              fail("Should throw an exception.");
          } catch (RuntimeException re) {
              // The Client has no active connections to the host, so it times out
              // trying to find a live connection to the host.
              assertThat(re.getCause(), instanceOf(NoHostAvailableException.class));

              //
              // should recover when the server comes back
              //

              // restart server
              this.startServer();

              // try a bunch of times to reconnect. on slower systems this may simply take longer...looking at you travis
              for (int ix = 1; ix < 11; ix++) {
                  // the retry interval is 1 second, wait a bit longer
                  TimeUnit.SECONDS.sleep(5);

                  try {
                      final List<Result> results = client.submit("1+1").all().get(3000, TimeUnit.MILLISECONDS);
                      assertEquals(1, results.size());
                      assertEquals(2, results.get(0).getInt());
                      // reconnected successfully, so stop retrying
                      break;
                  } catch (Exception ex) {
                      if (ix == 10)
                          fail("Should have eventually succeeded");
                  }
              }
          } finally {
              cluster.close();
          }
      }
      

      Note that there is a similar test, shouldFailOnDeadHost(), which first allows a connection to a host, then kills the host and restarts it. That test demonstrates that reconnection works in that situation.

      I thought it might be an easy fix to simply call considerHostUnavailable() in the ConnectionPool constructor in the event of a CompletionException, which should kickstart the reconnect process. The reconnects started firing, but they all failed for some reason. I didn't have time to investigate further than that.

      Currently the only workaround is to recreate the `Client` if this sort of situation occurs.
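
      The workaround above could be sketched roughly as follows. This is only an illustration, not driver code: RecreatingHolder and its parameters are hypothetical names, and the helper is written generically so that it does not depend on the driver. In the TinkerPop case, one might pass cluster::connect as the factory, Client::close as the closer, and a predicate that checks whether the exception's cause is a NoHostAvailableException (all assumptions based on the test above).

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of the driver): holds a client-like object
 * and replaces it with a fresh one whenever a call fails in a way that the
 * caller considers unrecoverable for the current instance.
 */
final class RecreatingHolder<C> {
    private final Supplier<C> factory;                       // e.g. cluster::connect (assumption)
    private final Consumer<C> closer;                        // e.g. Client::close (assumption)
    private final Predicate<RuntimeException> shouldRecreate; // e.g. cause is NoHostAvailableException
    private C current;

    RecreatingHolder(Supplier<C> factory, Consumer<C> closer,
                     Predicate<RuntimeException> shouldRecreate) {
        this.factory = factory;
        this.closer = closer;
        this.shouldRecreate = shouldRecreate;
    }

    /** Runs the action, discarding and recreating the held object on matching failures. */
    synchronized <R> R call(Function<C, R> action, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (current == null) {
                    current = factory.get();
                }
                return action.apply(current);
            } catch (RuntimeException e) {
                if (!shouldRecreate.test(e)) {
                    throw e; // unrelated failure: do not retry
                }
                last = e;
                discardCurrent(); // the next attempt starts with a fresh instance
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw last; // every attempt failed with a matching exception
    }

    private void discardCurrent() {
        final C stale = current;
        current = null;
        if (stale != null) {
            try {
                closer.accept(stale);
            } catch (RuntimeException ignored) {
                // closing a broken instance is best effort
            }
        }
    }
}
```

      A successful call keeps the current instance for reuse, so the recreate cost is only paid after a failure. Whether closing the stale Client is safe while other threads still use it is not addressed here.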

          People

            Assignee: Stephen Mallette (spmallette)
            Reporter: Stephen Mallette (spmallette)