Uploaded image for project: 'TinkerPop'
  1. TinkerPop
  2. TINKERPOP-790

Implement AutoCloseable on TraversalSource

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: 3.0.2-incubating
    • Fix Version/s: 3.2.3
    • Component/s: process
    • Labels:
      None

      Description

      A TraversalSource may have resources to release so having a close method would allows that release to occur. The specific case has to do with EventStrategy which registers a listener on Transaction. That listener should be removed when the user is done with the TraversalSource, but there is currently no way to clean that up. The calling of close should prevent future traversals from that TraversalSource.

      I suppose this also means that a TraversalStrategy will need to optionally implement AutoCloseable or some other marker interface to designate it as a strategy that needs to release resources.

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/tinkerpop/pull/437

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/tinkerpop/pull/437
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user twilmes commented on the issue:

          https://github.com/apache/tinkerpop/pull/437

          VOTE: +1

          Show
          githubbot ASF GitHub Bot added a comment - Github user twilmes commented on the issue: https://github.com/apache/tinkerpop/pull/437 VOTE: +1
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user spmallette commented on the issue:

          https://github.com/apache/tinkerpop/pull/437

          Just rebased on master to clean up conflicts and force pushed.

          Show
          githubbot ASF GitHub Bot added a comment - Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/437 Just rebased on master to clean up conflicts and force pushed.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user okram commented on the issue:

          https://github.com/apache/tinkerpop/pull/437

          Good. Was scared at first seeing "new methods" on Traversal and TraversalSource, but I think this is smart. Gotz to release da resourcezzzzzzz...

          VOTE +1.

          Show
          githubbot ASF GitHub Bot added a comment - Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/437 Good. Was scared at first seeing "new methods" on Traversal and TraversalSource, but I think this is smart. Gotz to release da resourcezzzzzzz... VOTE +1.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user davebshow commented on the issue:

          https://github.com/apache/tinkerpop/pull/437

          Good stuff, I should get there today.

          Show
          githubbot ASF GitHub Bot added a comment - Github user davebshow commented on the issue: https://github.com/apache/tinkerpop/pull/437 Good stuff, I should get there today.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user spmallette opened a pull request:

          https://github.com/apache/tinkerpop/pull/437

          TINKERPOP-790 TraversalSource and Traversal implement AutoCloseable.

          https://issues.apache.org/jira/browse/TINKERPOP-790

          This allowed for `TraversalSource` instances created using `withRemote()` to close out open `RemoteConnection` instances. It also sets up for side-effects created by a "remote" `Traversal` to be cleared from cache on the server. That work is not done yet and is reserved for a different ticket.

          Builds with `$ mvn clean install && mvn verify -pl gremlin-server -DskipIntegrationTests=false`

          I won't VOTE on this yet as I suspect there might be some feedback from @okram perhaps.

          @davebshow note the placeholder for sending the "close" message for side-effects.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/apache/tinkerpop TINKERPOP-790

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/tinkerpop/pull/437.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #437


          commit 14211691effffec1f9fdf6ddd92c5833dff234de
          Author: Stephen Mallette <spmva@genoprime.com>
          Date: 2016-09-25T02:18:56Z

          TraversalSource and Traversal implement AutoCloseable.

          This allowed for TraversalSource instances created using withRemote() to close out open RemoteConnection instances. It also sets up for side-effects created by a "remote" Traversal to be cleared from cache on the server. That work is not done yet and is reserved for a different ticket.


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user spmallette opened a pull request: https://github.com/apache/tinkerpop/pull/437 TINKERPOP-790 TraversalSource and Traversal implement AutoCloseable. https://issues.apache.org/jira/browse/TINKERPOP-790 This allowed for `TraversalSource` instances created using `withRemote()` to close out open `RemoteConnection` instances. It also sets up for side-effects created by a "remote" `Traversal` to be cleared from cache on the server. That work is not done yet and is reserved for a different ticket. Builds with `$ mvn clean install && mvn verify -pl gremlin-server -DskipIntegrationTests=false` I won't VOTE on this yet as I suspect there might be some feedback from @okram perhaps. @davebshow note the placeholder for sending the "close" message for side-effects. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tinkerpop TINKERPOP-790 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/437.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #437 commit 14211691effffec1f9fdf6ddd92c5833dff234de Author: Stephen Mallette <spmva@genoprime.com> Date: 2016-09-25T02:18:56Z TraversalSource and Traversal implement AutoCloseable. This allowed for TraversalSource instances created using withRemote() to close out open RemoteConnection instances. It also sets up for side-effects created by a "remote" Traversal to be cleared from cache on the server. That work is not done yet and is reserved for a different ticket.
          Hide
          okram Marko A. Rodriguez added a comment -

          If we are going to do this, we should do it for 3.2.0 as we are already refactored TraversalSource TINKERPOP-971.

          Show
          okram Marko A. Rodriguez added a comment - If we are going to do this, we should do it for 3.2.0 as we are already refactored TraversalSource TINKERPOP-971 .
          Hide
          mikepersonick Mike Personick added a comment -

          I feel very strongly that any open-ended result sets provided by the graph must be assumed to be backed by a database resource that needs to be closed. It's true for Blazegraph and it will certainly be true for others. When you ask for a tuple iterator on a database you must close it, and that is what asking for an iterator over edges, vertices, properties is. Furthermore, closing the tuple iterator does not close your connection to the database or close the database - these things might be backed by different resources that are shared across queries or across connections. If you ask for a query result iterator, you close it when you're done with it. If you ask for a connection to the database, you close it when you're done with it. If you open a handle to the database itself, you close it when you're done with it. Closing has a cascading effect down that hierarchy, but not up (closing a connection will close and result set iterators associated with that connection but will not close the database itself).

          This is an issue in the Graph API, where many methods return simple Iterators, which are not AutoCloseable. Materializing the result set from a call to vertices() or edges() fully into memory first to manage this problem is not a viable solution for a large graph. For Blazegraph, I hid a close() inside the last call to next(), but this does not protect callers that do not fully drain the iterator. Using GC/finalization to manage these resources is definitely not a viable solution.

          I have not looking at TraversalStrategy yet but I assume it will present a similar problem.

          Show
          mikepersonick Mike Personick added a comment - I feel very strongly that any open-ended result sets provided by the graph must be assumed to be backed by a database resource that needs to be closed. It's true for Blazegraph and it will certainly be true for others. When you ask for a tuple iterator on a database you must close it, and that is what asking for an iterator over edges, vertices, properties is. Furthermore, closing the tuple iterator does not close your connection to the database or close the database - these things might be backed by different resources that are shared across queries or across connections. If you ask for a query result iterator, you close it when you're done with it. If you ask for a connection to the database, you close it when you're done with it. If you open a handle to the database itself, you close it when you're done with it. Closing has a cascading effect down that hierarchy, but not up (closing a connection will close and result set iterators associated with that connection but will not close the database itself). This is an issue in the Graph API, where many methods return simple Iterators, which are not AutoCloseable. Materializing the result set from a call to vertices() or edges() fully into memory first to manage this problem is not a viable solution for a large graph. For Blazegraph, I hid a close() inside the last call to next(), but this does not protect callers that do not fully drain the iterator. Using GC/finalization to manage these resources is definitely not a viable solution. I have not looking at TraversalStrategy yet but I assume it will present a similar problem.
          Hide
          spmallette stephen mallette added a comment -

          This is turning into a brute of a change. When we did the hangout to discuss what could be done for 3.1.0 prior to code freeze, I'd thought I could simplify the issue by trying to just focus on implementing AutoCloseable on EventStrategy so that we could at least release the resources held on the Graph instance, but that isn't turning out to be very testable given the current way things are setup.

          Given that:

          1. we are close to code freeze and this is remaining a "big" issue
          2. david robinson has brought up some issues related to how we are dealing with eventing
          3. marko is gone for the next couple of days and this starts to get into territory he thinks about constantly in TraversalSource

          I think it's best if I not introduce anything new here for 3.1.0 - it's best we leave this until the issue has clear resolution.

          Show
          spmallette stephen mallette added a comment - This is turning into a brute of a change. When we did the hangout to discuss what could be done for 3.1.0 prior to code freeze, I'd thought I could simplify the issue by trying to just focus on implementing AutoCloseable on EventStrategy so that we could at least release the resources held on the Graph instance, but that isn't turning out to be very testable given the current way things are setup. Given that: 1. we are close to code freeze and this is remaining a "big" issue 2. david robinson has brought up some issues related to how we are dealing with eventing 3. marko is gone for the next couple of days and this starts to get into territory he thinks about constantly in TraversalSource I think it's best if I not introduce anything new here for 3.1.0 - it's best we leave this until the issue has clear resolution.
          Hide
          spmallette stephen mallette added a comment -

          In relation to the Matt Frantz comments on the hierarchy - we also have this related issue: TINKERPOP3-789

          Show
          spmallette stephen mallette added a comment - In relation to the Matt Frantz comments on the hierarchy - we also have this related issue: TINKERPOP3-789
          Hide
          spmallette stephen mallette added a comment -

          I had expected that some TraversalStrategy would implement AutoCloseable as needed and those that did would have close() called - basically a lightweight version of the full hierarchy concept {Matt Frantz mentioned. I hadn't come across a spot yet where a GraphTraversal needs to "close". Seems like it wouldn't be bad for the GraphTraversalSource to just spawn independent Traversal objects. That doesn't seem too confusing to me, but now that it's been mentioned, it might not be a bad idea to consider if we can think of a use case where it would matter.

          Show
          spmallette stephen mallette added a comment - I had expected that some TraversalStrategy would implement AutoCloseable as needed and those that did would have close() called - basically a lightweight version of the full hierarchy concept { Matt Frantz mentioned. I hadn't come across a spot yet where a GraphTraversal needs to "close". Seems like it wouldn't be bad for the GraphTraversalSource to just spawn independent Traversal objects. That doesn't seem too confusing to me, but now that it's been mentioned, it might not be a bad idea to consider if we can think of a use case where it would matter.
          Hide
          mhfrantz Matt Frantz added a comment -

          Decide which of the objects will be responsible for others.

          What is the lifecycle of a GraphTraversal that I create from a GraphTraversalSource? Am I allowed to close the GraphTraversalSource but still use the GraphTraversal? Or is there a parent-child relationship?

          Identify reusable objects in many-to-many relationships.

          Can I configure and install a single TraversalStrategy instance on multiple Traversal instances?

          A lifecycle hierarchy would be easier for developers to reason about, even if it is not as flexible as other lifecycle schema. The idea would be that calling close on something higher in the hierarchy will close everything below it, and render any subsequent API calls on closed objects as "unsupported" or "undefined" behavior. Could the hierarchy be something like this?

          • Graph
          • GraphTraversalSource
          • TraversalStrategy
          • GraphTraversal
          • Transaction

          Removing something, e.g. TraversalStrategy or Transaction, from the hierarchy would open up new use cases, at the expense of increasing the behavior surface area and complicating the rules that developers must follow regarding when to call close.

          Moving a closable API up or down within the hierarchy would result in an equally complex hierarchy, albeit one with different use cases.

          Using a schema other than a hierarchy means that developers would be encouraged to call close more often, as each independently closable thing is allowed to have expensive resources.

          A developer can choose to allow GC to close its resources, but that choice can lead to more surprising behavior resulting from the nondeterminism of GC, dynamic vendor implementations, new vendor versions, and cross-vendor skew.

          Show
          mhfrantz Matt Frantz added a comment - Decide which of the objects will be responsible for others. What is the lifecycle of a GraphTraversal that I create from a GraphTraversalSource ? Am I allowed to close the GraphTraversalSource but still use the GraphTraversal ? Or is there a parent-child relationship? Identify reusable objects in many-to-many relationships. Can I configure and install a single TraversalStrategy instance on multiple Traversal instances? — A lifecycle hierarchy would be easier for developers to reason about, even if it is not as flexible as other lifecycle schema. The idea would be that calling close on something higher in the hierarchy will close everything below it, and render any subsequent API calls on closed objects as "unsupported" or "undefined" behavior. Could the hierarchy be something like this? Graph GraphTraversalSource TraversalStrategy GraphTraversal Transaction Removing something, e.g. TraversalStrategy or Transaction , from the hierarchy would open up new use cases, at the expense of increasing the behavior surface area and complicating the rules that developers must follow regarding when to call close . Moving a closable API up or down within the hierarchy would result in an equally complex hierarchy, albeit one with different use cases. Using a schema other than a hierarchy means that developers would be encouraged to call close more often, as each independently closable thing is allowed to have expensive resources. A developer can choose to allow GC to close its resources, but that choice can lead to more surprising behavior resulting from the nondeterminism of GC, dynamic vendor implementations, new vendor versions, and cross-vendor skew.
          Hide
          okram Marko A. Rodriguez added a comment -

          I can easily make TraversalSource extend AutoCloseable. The question is, how would you update (lets say) GraphTravesalSource to clean up EventStrategy stuff? Seems we would need strategies to also be able to implement AutoCloseable and if so, on TraversalSource.close() loop through strategies and if they can be closed (close them). Thoughts?

          Show
          okram Marko A. Rodriguez added a comment - I can easily make TraversalSource extend AutoCloseable . The question is, how would you update (lets say) GraphTravesalSource to clean up EventStrategy stuff? Seems we would need strategies to also be able to implement AutoCloseable and if so, on TraversalSource.close() loop through strategies and if they can be closed (close them). Thoughts?

            People

            • Assignee:
              spmallette stephen mallette
              Reporter:
              spmallette stephen mallette
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development