Uploaded image for project: 'TinkerPop'
  1. TinkerPop
  2. TINKERPOP-1502

Chained has()-steps should simply left-append HasContainers in Gremlin-Java.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.2
    • Fix Version/s: 3.2.4
    • Component/s: process
    • Labels:
      None

      Description

      In Gremlin-Java, g.V().has(a).has(b).has(c).out() is originally represented as [GraphStep,HasStep(a),HasStep(b),HasStep(c),VertexStep]. Ultimately, InlineFilterStrategy or most provider strategies will turn such HasStep-chains into [GraphStep,HasStep(a,b,c),VertexStep]. That is, strategies fold has()-steps "left" and delete "right" has()-steps and left propagates their labels (i.e. clock cycles). I think that GraphTraversal should simply do this:

      public GraphTraversal has(whateves) {
        if(this.getEndStep() instanceof HasStep)
          this.getEndSte().addHasContainer(new HasContainer(whateves))
        else
          this.addStep(new HasStep(new HasContainer(whateves)));
        this.bytecode.addStep("has",whateves);
        return this;
      }
      

      In essence, a "write time" optimization can be done. Given that chains of has()'s is super common, this can save significant clock-cycles in the long run of a production application.

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user okram opened a pull request:

          https://github.com/apache/tinkerpop/pull/495

          TINKERPOP-1502 & TINKERPOP-1482: Chained has-containers and AndP stringency

          https://issues.apache.org/jira/browse/TINKERPOP-1502
          https://issues.apache.org/jira/browse/TINKERPOP-1482

          This ticket accomplishes various things around `GraphTraversal.has()`.

          1.) A write-time optimization is provided where if a `HasStep` is the traversal's end step, simply `HasStep.addHasContainer()`. This reduces the time required by `InlineFilterStrategy` and provider strategies to fold in `HasContainers` and remove `HasSteps`.

          2.) `GraphTraversal.hasXXX()` was tweaked by @dkuppitz a release ago. He used `Object` to determine numerous things like `String`, `P`, etc. This is bad. We want the `GraphTraversal` API to be explicit so I added stricter typing: e.g. `hasLabel(String,String...)` and `hasLabel(P<String>)`.

          3.) `HasContainer.makeHasContainers()` was a TinkerPop hack trying to help providers and it shouldn't be done. This method turned `AndP(x,y)` into individual `x` and `y` predicates so providers didn't have to analyze `AndP` trees. We shouldn't do this as the provider will be getting `AndP` compositions elsewhere and will need to be able to handle it. I updated `Neo4jGraphStep` and `TinkerGraphStep` to process `AndP` predicates accordingly for index lookups. Added a note to upgrade docs for other providers — simple update if they want to use the old "break apart `AndP`"-model.

          4.) Found a bug in `InlineFilterStrategy` around `outE().hasLabel()` where `outE('knows').hasLabel('created')` was turned into `outE('knows','created')` and thus, semantically incorrect.

          Going to add a few more tests before VOTING.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/apache/tinkerpop TINKERPOP-1502

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/tinkerpop/pull/495.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #495


          commit 83f2ddb6629eab9c4dca8c4103dce60a3b8fc7a6
          Author: Marko A. Rodriguez <okrammarko@gmail.com>
          Date: 2016-11-14T19:54:07Z

          first push on 'left-append' of has() containers.

          commit d471f2d22fe0e040504ab98a1735ad9a6edaa692
          Author: Marko A. Rodriguez <okrammarko@gmail.com>
          Date: 2016-11-14T20:46:00Z

          fixed up the hasXXX() methods to account of P and Object differently. Will go through and create a TraversalHelper.addHasContainer() method which will left append or right append depending on Traversal state. this will simplify methods signficiantly.

          commit 07f264241fa7a1124faac27a06df83a7449b75c9
          Author: Marko A. Rodriguez <okrammarko@gmail.com>
          Date: 2016-11-14T22:44:40Z

          added TraversalHelper.addHasContainer() which will either append a HasStep with container or if the traverasl ends with a HasContainerHolder, fold the container into the holder. This just makes the code in GraphTravesrsal cleaner with less copy/paste.

          commit e29e1c33fddf2ea245027fe91dec882b8ffdf4d7
          Author: Marko A. Rodriguez <okrammarko@gmail.com>
          Date: 2016-11-15T15:27:24Z

          found a bug in HasContainer.makeHasContainers() around AndP recurssion. No biggie, just didn't yield ultimate optimization. Found a bug in InlineFilterStrategy where hasLabel() should only back propagate into VertexStep[edges] if the step doesn't already have edge labels. Removed HasContainer.makeHasContainers() – another ticket I'm putting into this branch. GraphTraversal.has() is fixed up accordingly with left-fold HasContainers and valid Object[] usage. Going to add a few more tests around hasXXX() steps.


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user okram opened a pull request: https://github.com/apache/tinkerpop/pull/495 TINKERPOP-1502 & TINKERPOP-1482 : Chained has-containers and AndP stringency https://issues.apache.org/jira/browse/TINKERPOP-1502 https://issues.apache.org/jira/browse/TINKERPOP-1482 This ticket accomplishes various things around `GraphTraversal.has()`. 1.) A write-time optimization is provided where if a `HasStep` is the traversal's end step, simply `HasStep.addHasContainer()`. This reduces the time required by `InlineFilterStrategy` and provider strategies to fold in `HasContainers` and remove `HasSteps`. 2.) `GraphTraversal.hasXXX()` was tweaked by @dkuppitz a release ago. He used `Object` to determine numerous things like `String`, `P`, etc. This is bad. We want the `GraphTraversal` API to be explicit so I added stricter typing: e.g. `hasLabel(String,String...)` and `hasLabel(P<String>)`. 3.) `HasContainer.makeHasContainers()` was a TinkerPop hack trying to help providers and it shouldn't be done. This method turned `AndP(x,y)` into individual `x` and `y` predicates so providers didn't have to analyze `AndP` trees. We shouldn't do this as the provider will be getting `AndP` compositions elsewhere and will need to be able to handle it. I updated `Neo4jGraphStep` and `TinkerGraphStep` to process `AndP` predicates accordingly for index lookups. Added a note to upgrade docs for other providers — simple update if they want to use the old "break apart `AndP`"-model. 4.) Found a bug in `InlineFilterStrategy` around `outE().hasLabel()` where `outE('knows').hasLabel('created')` was turned into `outE('knows','created')` and thus, semantically incorrect. Going to add a few more tests before VOTING. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tinkerpop TINKERPOP-1502 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/495.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #495 commit 83f2ddb6629eab9c4dca8c4103dce60a3b8fc7a6 Author: Marko A. Rodriguez <okrammarko@gmail.com> Date: 2016-11-14T19:54:07Z first push on 'left-append' of has() containers. commit d471f2d22fe0e040504ab98a1735ad9a6edaa692 Author: Marko A. Rodriguez <okrammarko@gmail.com> Date: 2016-11-14T20:46:00Z fixed up the hasXXX() methods to account of P and Object differently. Will go through and create a TraversalHelper.addHasContainer() method which will left append or right append depending on Traversal state. this will simplify methods signficiantly. commit 07f264241fa7a1124faac27a06df83a7449b75c9 Author: Marko A. Rodriguez <okrammarko@gmail.com> Date: 2016-11-14T22:44:40Z added TraversalHelper.addHasContainer() which will either append a HasStep with container or if the traverasl ends with a HasContainerHolder, fold the container into the holder. This just makes the code in GraphTravesrsal cleaner with less copy/paste. commit e29e1c33fddf2ea245027fe91dec882b8ffdf4d7 Author: Marko A. Rodriguez <okrammarko@gmail.com> Date: 2016-11-15T15:27:24Z found a bug in HasContainer.makeHasContainers() around AndP recurssion. No biggie, just didn't yield ultimate optimization. Found a bug in InlineFilterStrategy where hasLabel() should only back propagate into VertexStep [edges] if the step doesn't already have edge labels. Removed HasContainer.makeHasContainers() – another ticket I'm putting into this branch. GraphTraversal.has() is fixed up accordingly with left-fold HasContainers and valid Object[] usage. Going to add a few more tests around hasXXX() steps.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dkuppitz commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          ```
          gremlin> g.V(1).outE("knows")
          ==>e[7][1-knows->2]
          ==>e[8][1-knows->4]
          gremlin> g.V(1).outE("knows").hasLabel("created")
          ==>e[9][1-created->3]
          ```

          This is still wrong. No `knows` edge will have a `created` label. The result should be empty. Furthermore I think that this is wrong:

          ```
          gremlin> g.V(1).outE().hasLabel("knows").hasLabel("created")
          ==>e[7][1-knows->2]
          ==>e[8][1-knows->4]
          ==>e[9][1-created->3]
          ```

          It should all boil down to `hasLabel(eq("knows").and(eq("created")))` / `has(label, eq("knows").and(eq("created")))`. Chained `has()` calls are / should be `and()`'ed, not `or()`'ed.

          Show
          githubbot ASF GitHub Bot added a comment - Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/495 ``` gremlin> g.V(1).outE("knows") ==>e [7] [1-knows->2] ==>e [8] [1-knows->4] gremlin> g.V(1).outE("knows").hasLabel("created") ==>e [9] [1-created->3] ``` This is still wrong. No `knows` edge will have a `created` label. The result should be empty. Furthermore I think that this is wrong: ``` gremlin> g.V(1).outE().hasLabel("knows").hasLabel("created") ==>e [7] [1-knows->2] ==>e [8] [1-knows->4] ==>e [9] [1-created->3] ``` It should all boil down to `hasLabel(eq("knows").and(eq("created")))` / `has(label, eq("knows").and(eq("created")))`. Chained `has()` calls are / should be `and()`'ed, not `or()`'ed.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user okram commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          What you have in your examples is the "bug" that is fixed in this PR:

          ```
          gremlin> g.V(1).outE('knows').hasLabel('created')
          gremlin>
          ```

          Likewise:

          ```
          gremlin> g.V(1).outE().hasLabel("knows").hasLabel("created")
          gremlin>
          ```

          ```
          gremlin> g.V(1).outE().hasLabel("knows")
          ==>e[7][1-knows->2]
          ==>e[8][1-knows->4]
          gremlin> g.V(1).outE().hasLabel("knows").explain()
          ==>Traversal Explanation
          =============================================================================================================
          Original Traversal [GraphStep(vertex,[1]), VertexStep(OUT,edge), HasStep([~label.eq(knows)])]
          ...
          InlineFilterStrategy [O] [GraphStep(vertex,[1]), VertexStep(OUT,[knows],edge)]
          ...
          Final Traversal [TinkerGraphStep(vertex,[1]), VertexStep(OUT,[knows],edge)]
          gremlin>
          ```

          Show
          githubbot ASF GitHub Bot added a comment - Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/495 What you have in your examples is the "bug" that is fixed in this PR: ``` gremlin> g.V(1).outE('knows').hasLabel('created') gremlin> ``` Likewise: ``` gremlin> g.V(1).outE().hasLabel("knows").hasLabel("created") gremlin> ``` ``` gremlin> g.V(1).outE().hasLabel("knows") ==>e [7] [1-knows->2] ==>e [8] [1-knows->4] gremlin> g.V(1).outE().hasLabel("knows").explain() ==>Traversal Explanation ============================================================================================================= Original Traversal [GraphStep(vertex, [1] ), VertexStep(OUT,edge), HasStep( [~label.eq(knows)] )] ... InlineFilterStrategy [O] [GraphStep(vertex, [1] ), VertexStep(OUT, [knows] ,edge)] ... Final Traversal [TinkerGraphStep(vertex, [1] ), VertexStep(OUT, [knows] ,edge)] gremlin> ```
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dkuppitz commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          Alright, looks like I pulled too early, just got some more changes coming through. I will retest the latest.

          Show
          githubbot ASF GitHub Bot added a comment - Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/495 Alright, looks like I pulled too early, just got some more changes coming through. I will retest the latest.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dkuppitz commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          Yea, looks better now. Not sure if we should really keep all the steps in `g.V(1).outE('knows').hasLabel('created').more().bla()`. It's like the Java compiler would keep stuff within a `if (false)

          { ... }

          ` block. The static analysis already tells us, that nothing will make it through the filter.

          You already treat this particular case in a dedicated code block, why not remove everything that comes after it and append a `.not(identity())`?

          Show
          githubbot ASF GitHub Bot added a comment - Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/495 Yea, looks better now. Not sure if we should really keep all the steps in `g.V(1).outE('knows').hasLabel('created').more().bla()`. It's like the Java compiler would keep stuff within a `if (false) { ... } ` block. The static analysis already tells us, that nothing will make it through the filter. You already treat this particular case in a dedicated code block, why not remove everything that comes after it and append a `.not(identity())`?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user okram commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          We could. However, its such a rare situation ............ too much optimization?

          Show
          githubbot ASF GitHub Bot added a comment - Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/495 We could. However, its such a rare situation ............ too much optimization?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user okram commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          VOTE +1.

          Show
          githubbot ASF GitHub Bot added a comment - Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/495 VOTE +1.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dkuppitz commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          Since this PR touches `FilterRankingStrategyTest`s, it would be nice to have this issue fixed as well (and have some tests added to cover this scenario):

          ```
          gremlin> g = TinkerFactory.createModern().traversal()
          ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
          gremlin> g.V(1).as("a").both().both().has("age").where(neq("a")).by("age")
          The property does not exist as the key has no associated value for the provided element: v[5]:age
          Type ':help' or ':h' for help.
          Display stack trace? [yN]N
          gremlin> g.V(1).as("a").both().both().has("age").where(neq("a")).by("age").explain()
          ==>Traversal Explanation
          ===========================================================================================================================================================================================================================================
          Original Traversal [GraphStep(vertex,[1])@[a], VertexStep(BOTH,vertex), VertexStep(BOTH,vertex), TraversalFilterStep([PropertiesStep([age],value)]), WherePredicateStep(neq(a),[value(age)])]
          ...
          FilterRankingStrategy [O] [GraphStep(vertex,[1])@[a], VertexStep(BOTH,vertex), VertexStep(BOTH,vertex), WherePredicateStep(neq(a),[value(age)]), TraversalFilterStep([PropertiesStep([age],value)])]
          ...
          Final Traversal [TinkerGraphStep(vertex,[1])@[a], VertexStep(BOTH,vertex), VertexStep(BOTH,vertex), WherePredicateStep(neq(a),[value(age)]), NoOpBarrierStep(2500), TraversalFilterStep([PropertiesStep([age],property)])]
          ```

          Show
          githubbot ASF GitHub Bot added a comment - Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/495 Since this PR touches `FilterRankingStrategyTest`s, it would be nice to have this issue fixed as well (and have some tests added to cover this scenario): ``` gremlin> g = TinkerFactory.createModern().traversal() ==>graphtraversalsource[tinkergraph [vertices:6 edges:6] , standard] gremlin> g.V(1).as("a").both().both().has("age").where(neq("a")).by("age") The property does not exist as the key has no associated value for the provided element: v [5] :age Type ':help' or ':h' for help. Display stack trace? [yN] N gremlin> g.V(1).as("a").both().both().has("age").where(neq("a")).by("age").explain() ==>Traversal Explanation =========================================================================================================================================================================================================================================== Original Traversal [GraphStep(vertex, [1] )@ [a] , VertexStep(BOTH,vertex), VertexStep(BOTH,vertex), TraversalFilterStep([PropertiesStep( [age] ,value)]), WherePredicateStep(neq(a), [value(age)] )] ... FilterRankingStrategy [O] [GraphStep(vertex, [1] )@ [a] , VertexStep(BOTH,vertex), VertexStep(BOTH,vertex), WherePredicateStep(neq(a), [value(age)] ), TraversalFilterStep([PropertiesStep( [age] ,value)])] ... Final Traversal [TinkerGraphStep(vertex, [1] )@ [a] , VertexStep(BOTH,vertex), VertexStep(BOTH,vertex), WherePredicateStep(neq(a), [value(age)] ), NoOpBarrierStep(2500), TraversalFilterStep([PropertiesStep( [age] ,property)])] ```
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dkuppitz commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          VOTE: +1

          Show
          githubbot ASF GitHub Bot added a comment - Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/495 VOTE: +1
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user spmallette commented on the issue:

          https://github.com/apache/tinkerpop/pull/495

          VOTE +1

          Show
          githubbot ASF GitHub Bot added a comment - Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/495 VOTE +1
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/tinkerpop/pull/495

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/tinkerpop/pull/495

            People

            • Assignee:
              okram Marko A. Rodriguez
              Reporter:
              okram Marko A. Rodriguez
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development