TINKERPOP-1127

client fails to reconnect to restarted server

    Description

      If a gremlin-server is restarted, the client will never reconnect to it.

      Start server1
      Start server2
      Start a client such as:

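      import java.util.concurrent.TimeUnit;

      import org.apache.tinkerpop.gremlin.driver.Client;
      import org.apache.tinkerpop.gremlin.driver.Cluster;
      import org.apache.tinkerpop.gremlin.driver.MessageSerializer;
      import org.apache.tinkerpop.gremlin.driver.ResultSet;
      import org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0;
      import org.apache.tinkerpop.gremlin.structure.io.gryo.GryoMapper;
      // Package assumed from Titan 1.0; adjust if your Titan version differs.
      import com.thinkaurelius.titan.graphdb.tinkerpop.TitanIoRegistry;

      // The class name is illustrative; the original snippet showed only the method body.
      public class ReconnectTest {
          public static void main(String[] args) {
              GryoMapper kryo = GryoMapper.build().addRegistry(TitanIoRegistry.INSTANCE).create();
              MessageSerializer serializer = new GryoMessageSerializerV1d0(kryo);
              // Two contact points, so the client can fail over between server1 and server2.
              Cluster titanCluster = Cluster.build()
                      .addContactPoints("54.X.X.X,54.Y.Y.Y".split(","))
                      .port(8182)
                      .minConnectionPoolSize(5)
                      .maxConnectionPoolSize(10)
                      .reconnectIntialDelay(1000)   // sic: the builder method is spelled this way
                      .reconnectInterval(30000)
                      .serializer(serializer)
                      .create();
              Client client = titanCluster.connect();
              client.init();

              System.out.println("initialized");
              // One write plus one read every 3 seconds, for about 10 minutes.
              for (int i = 0; i < 200; i++) {
                  try {
                      long id = System.currentTimeMillis();
                      ResultSet results = client.submit("graph.addVertex('a','" + id + "')");
                      results.one();
                      results = client.submit("g.V().has('a','" + id + "')");
                      System.out.println(results.one());
                  } catch (Exception e) {
                      // Failed queries are logged and the loop continues.
                      e.printStackTrace();
                  }

                  try {
                      TimeUnit.SECONDS.sleep(3);
                  } catch (InterruptedException e) {
                      e.printStackTrace();
                  }
              }

              System.out.println("done");
              client.close();

              System.exit(0);
          }
      }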
      


      After the client has performed a couple of query cycles:
      Restart server1
      Wait 60 seconds, which is long enough for a reconnect to occur
      Stop server2
      Notice that there are no more successful queries; the client has never reconnected to server1
      Start server2
      Notice that there are still no successful queries

      The method ConnectionPool.addConnectionIfUnderMaximum always returns false because opened >= maxPoolSize; in this particular case opened = 10. I believe the open counter is meant to track the size of the list of connections but is getting out of sync with it. The following diff addresses the problem for this particular case:

      diff --git a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
      index 96c151c..81ce81d 100644
      --- a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
      +++ b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
      @@ -326,6 +326,7 @@ final class ConnectionPool {
           private void definitelyDestroyConnection(final Connection connection) {
               bin.add(connection);
               connections.remove(connection);
      +        open.decrementAndGet();
      
               if (connection.borrowed.get() == 0 && bin.remove(connection))
                   connection.closeAsync();
      @@ -388,6 +389,8 @@ final class ConnectionPool {
      
               // if the host is unavailable then we should release the connections
               connections.forEach(this::definitelyDestroyConnection);
      +        // there are no connections open
      +        open.set(0);
      
               // let the load-balancer know that the host is acting poorly
               this.cluster.loadBalancingStrategy().onUnavailable(host);
      @@ -413,6 +416,7 @@ final class ConnectionPool {
                   this.cluster.loadBalancingStrategy().onAvailable(host);
                   return true;
               } catch (Exception ex) {
      +            logger.debug("Failed reconnect attempt on {}", host);
                   if (connection != null) definitelyDestroyConnection(connection);
                   return false;
               }
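
      The counter drift described above can be reproduced in isolation. The following is a minimal sketch, not the driver's code: TinyPool, addIfUnderMaximum, destroyAll and the decrementOnDestroy flag are hypothetical names that only mimic the relationship between the open counter and the connection list. With the flag off, destroying connections leaves the counter at the maximum and every later add is refused; with it on, the pool can grow again.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolCounterSketch {

    /** Hypothetical stand-in for a connection pool; not the driver's ConnectionPool. */
    static final class TinyPool {
        private final List<Object> connections = new CopyOnWriteArrayList<>();
        private final AtomicInteger open = new AtomicInteger();
        private final int maxPoolSize;
        private final boolean decrementOnDestroy;

        TinyPool(int maxPoolSize, boolean decrementOnDestroy) {
            this.maxPoolSize = maxPoolSize;
            this.decrementOnDestroy = decrementOnDestroy;
        }

        /** Reserves a slot in the counter first, then adds a connection to the list. */
        boolean addIfUnderMaximum() {
            while (true) {
                int opened = open.get();
                if (opened >= maxPoolSize) return false;
                if (open.compareAndSet(opened, opened + 1)) break;
            }
            connections.add(new Object());
            return true;
        }

        /** Removes every connection; only touches the counter when the flag is set. */
        void destroyAll() {
            // CopyOnWriteArrayList iterates over a snapshot, so removing here is safe.
            for (Object connection : connections) {
                connections.remove(connection);
                if (decrementOnDestroy) open.decrementAndGet();
            }
        }

        int liveConnections() { return connections.size(); }

        int openCounter() { return open.get(); }
    }

    public static void main(String[] args) {
        for (boolean decrementOnDestroy : new boolean[] {false, true}) {
            TinyPool pool = new TinyPool(2, decrementOnDestroy);
            pool.addIfUnderMaximum();
            pool.addIfUnderMaximum();
            pool.destroyAll(); // simulates the host being marked unavailable
            boolean reconnected = pool.addIfUnderMaximum();
            System.out.println("decrementOnDestroy=" + decrementOnDestroy
                    + " reconnected=" + reconnected
                    + " live=" + pool.liveConnections()
                    + " counter=" + pool.openCounter());
        }
    }
}
```

      Running main prints one line per variant, which makes it easy to compare the counter against the actual number of live connections in each case.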
      

            People

              spmallette Stephen Mallette
              kieransherlock Kieran Sherlock