Uploaded image for project: 'TinkerPop'
  1. TinkerPop
  2. TINKERPOP-1541

Select should default to Pop.last semantics

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.3
    • Fix Version/s: 3.3.0
    • Component/s: process
    • Labels:

      Description

      Check this out:

      gremlin> g.V().as('a').select('a').as('a').select('a')
      ==>[v[1],v[1]]
      ==>[v[2],v[2]]
      ==>[v[3],v[3]]
      ==>[v[4],v[4]]
      ==>[v[5],v[5]]
      ==>[v[6],v[6]]
      

      Shouldn't we just return the uniques? This is a big decision as this can cause massive rippling breakage for users .

        Issue Links

          Activity

          Hide
          rdale Robert Dale added a comment -

          I would have expected a later label to overwrite any previous label of the same name, not add to a list. Other than this seeming to be a really odd thing to do, I was curious enough to do some digging.

          Select Step doesn't really seem to document this behavior other than this:

          NOTE: When executing a traversal with select() on a standard traversal engine (i.e. OLTP), select() will do its best to avoid calculating the path history and instead, will rely on a global data structure for storing the currently selected object.

          I'm not sure what the correct interpretation is. My interpretation is that only the 'current' object is selected minus path, or effectively, the last assignment overwrites any previous assignment. But it could also include the path history. It sounds a little wishy-washy so that leaves me wondering two things: 1) what exactly is the path information, and 2) how do I know when I'm going to get what? That is, when will I get the current object (and of what?) or the path or is the current object the path??

          Let's use examples.

          gremlin> g.V().as('a').select('a').as('a').select('a')
          ==>[v[0],v[0]]
          

          Above, it looks like we have a path. The first element is the first 'a' and the second is the second 'a'. Pretty simple.

          gremlin> g.V().as('a').select('a').as('a').select('a').as('a').select('a')
          ==>[v[0],v[0],[v[0],v[0]]]
          

          Above, again we have the first element is the first 'a' and the second is the second 'a'. But where it gets more interesting is the third element. This is the path result from the second step. So I think this reinforces the idea that if using the same label repeatedly, you are in fact building a path of paths. However, if it should be selecting the 'current object', and it can be a path, then I might expect to see only

          [v[0],v[0]]
          

          The other question might be is this just marking steps along a path instead of building a path? If so, then should this query be equivalent?

          gremlin> g.V().as('a').select('a').as('b').select('b').as('c').select('c').select('a','b','c')
          ==>[a:v[0],b:v[0],c:v[0]]
          

          So similarly I might expect:

          gremlin> g.V().as('a').select('a').as('a').select('a').as('a').select('a')
          ==>[v[0],v[0],v[0]]
          

          But if you're saying what I think you're saying, then what you're really saying is that while select is building path under the covers, it's not accessible and instead you get only the current object, which is not a path. Thus,

          gremlin> g.V().as('a').select('a').as('a').select('a').as('a').select('a')
          ==>v[0]
          

          Sorry, didn't answer your question. Haven't had much sleep. HTH.

          Show
          rdale Robert Dale added a comment - I would have expected a later label to overwrite any previous label of the same name, not add to a list. Other than this seeming to be a really odd thing to do, I was curious enough to do some digging. Select Step doesn't really seem to document this behavior other than this: NOTE: When executing a traversal with select() on a standard traversal engine (i.e. OLTP), select() will do its best to avoid calculating the path history and instead, will rely on a global data structure for storing the currently selected object. I'm not sure what the correct interpretation is. My interpretation is that only the 'current' object is selected minus path, or effectively, the last assignment overwrites any previous assignment. But it could also include the path history. It sounds a little wishy-washy so that leaves me wondering two things: 1) what exactly is the path information, and 2) how do I know when I'm going to get what? That is, when will I get the current object (and of what?) or the path or is the current object the path?? Let's use examples. gremlin> g.V().as('a').select('a').as('a').select('a') ==>[v[0],v[0]] Above, it looks like we have a path. The first element is the first 'a' and the second is the second 'a'. Pretty simple. gremlin> g.V().as('a').select('a').as('a').select('a').as('a').select('a') ==>[v[0],v[0],[v[0],v[0]]] Above, again we have the first element is the first 'a' and the second is the second 'a'. But where it gets more interesting is the third element. This is the path result from the second step. So I think this reinforces the idea that if using the same label repeatedly, you are in fact building a path of paths. However, if it should be selecting the 'current object', and it can be a path, then I might expect to see only [v[0],v[0]] The other question might be is this just marking steps along a path instead of building a path? If so, then should this query be equivalent? gremlin> g.V().as('a').select('a').as('b').select('b').as('c').select('c').select('a','b','c') ==>[a:v[0],b:v[0],c:v[0]] So similarly I might expect: gremlin> g.V().as('a').select('a').as('a').select('a').as('a').select('a') ==>[v[0],v[0],v[0]] But if you're saying what I think you're saying, then what you're really saying is that while select is building path under the covers, it's not accessible and instead you get only the current object, which is not a path. Thus, gremlin> g.V().as('a').select('a').as('a').select('a').as('a').select('a') ==>v[0] Sorry, didn't answer your question. Haven't had much sleep. HTH.
          Hide
          okram Marko A. Rodriguez added a comment -

          The reason we allow for the same path label to be used is because of:

          g.V().repeat(out().as('a')).times(3).select('a')
          

          Given that you can't change the label in the repeat(, you have it multiple times. Note that there is Pop where you can Pop.last, Pop.first, or Pop.all. Right now the default for most PathProcessor steps is Pop.all. If we could change that to Pop.last, then what we could do is simply analyze the traversal and compile time and unless there is a different Pop somewhere, then only keep the last version of the labeled step. That would be best. The problem – it would be a major breaking change.

          Show
          okram Marko A. Rodriguez added a comment - The reason we allow for the same path label to be used is because of: g.V().repeat(out().as('a')).times(3).select('a') Given that you can't change the label in the repeat( , you have it multiple times. Note that there is Pop where you can Pop.last , Pop.first , or Pop.all . Right now the default for most PathProcessor steps is Pop.all . If we could change that to Pop.last , then what we could do is simply analyze the traversal and compile time and unless there is a different Pop somewhere, then only keep the last version of the labeled step. That would be best. The problem – it would be a major breaking change.
          Hide
          okram Marko A. Rodriguez added a comment -

          The only two steps that don't default to Pop.last are SelectOneStep and SelectStep. I forced the to Pop.last default and only 6 test cases failed (all contrived usages).

          Wondering if for 3.3.0, we move to Pop.last default all around and add selectV3d0() as the backwards compatibility update for users.

          I suspect 99% of uses want Pop.last and it will allow us to do numerous optimizations that we couldn't do before.

          Show
          okram Marko A. Rodriguez added a comment - The only two steps that don't default to Pop.last are SelectOneStep and SelectStep . I forced the to Pop.last default and only 6 test cases failed (all contrived usages). Wondering if for 3.3.0, we move to Pop.last default all around and add selectV3d0() as the backwards compatibility update for users. I suspect 99% of uses want Pop.last and it will allow us to do numerous optimizations that we couldn't do before.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user okram opened a pull request:

          https://github.com/apache/tinkerpop/pull/482

          TINKERPOP-1541: Select should default to Pop.last semantics

          https://issues.apache.org/jira/browse/TINKERPOP-1541

          All `Scoping` steps use `Pop.last` semantics (`where()`, `match()`, `dedup()`), except `select()` which uses `Pop.mixed`. This is a wart from TinkerPop2 days. I changed `select()` to default to `Pop.last` and provided a `selectV3d0()` backwards compatibility step (@spmallette – is that how we should name it? or is it `selectV3d2()` now?).

          The benefit of this is that we can now introduce a simpler `Path` data structure if a traversal is all `Path.last` (which is 99% of traversals out there). That is, if the traversal requirements is `LABELED_PATH` and all pops are `Pop.last`, then we don't need to store "historic" path elements and can have a `MapPath` which is backed by a `Map<String,Object>`. Less data and a simpler/faster equality test for bulking.

          @robertdale –

          ```
          gremlin> g.V(1).as('a').select(mixed,'a').as('a').select(mixed,'a')
          ==>[v[1],v[1]]
          gremlin> g.V(1).as('a').select('a').as('a').select('a')
          ==>v[1]
          ```

          VOTE +1.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/apache/tinkerpop TINKERPOP-1541

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/tinkerpop/pull/482.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #482


          commit 2d147bf8ef2c3f79a8015632c85a8a01a679f2fd
          Author: Marko A. Rodriguez <okrammarko@gmail.com>
          Date: 2016-11-08T14:01:19Z

          Pop.mixed introduced to replace using a null Pop argument. SelectOneStep and SelectStep now default to using Pop.last instead of Pop.mixed. All other Scoping steps already use Pop.last semantics. This change will allow us to introduce a simpler Path data structure for LABELED_PATH traversals with complete Pop.last semantics (99 percent of uses).


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user okram opened a pull request: https://github.com/apache/tinkerpop/pull/482 TINKERPOP-1541 : Select should default to Pop.last semantics https://issues.apache.org/jira/browse/TINKERPOP-1541 All `Scoping` steps use `Pop.last` semantics (`where()`, `match()`, `dedup()`), except `select()` which uses `Pop.mixed`. This is a wart from TinkerPop2 days. I changed `select()` to default to `Pop.last` and provided a `selectV3d0()` backwards compatibility step (@spmallette – is that how we should name it? or is it `selectV3d2()` now?). The benefit of this is that we can now introduce a simpler `Path` data structure if a traversal is all `Path.last` (which is 99% of traversals out there). That is, if the traversal requirements is `LABELED_PATH` and all pops are `Pop.last`, then we don't need to store "historic" path elements and can have a `MapPath` which is backed by a `Map<String,Object>`. Less data and a simpler/faster equality test for bulking. @robertdale – ``` gremlin> g.V(1).as('a').select(mixed,'a').as('a').select(mixed,'a') ==>[v [1] ,v [1] ] gremlin> g.V(1).as('a').select('a').as('a').select('a') ==>v [1] ``` VOTE +1. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tinkerpop TINKERPOP-1541 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/482.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #482 commit 2d147bf8ef2c3f79a8015632c85a8a01a679f2fd Author: Marko A. Rodriguez <okrammarko@gmail.com> Date: 2016-11-08T14:01:19Z Pop.mixed introduced to replace using a null Pop argument. SelectOneStep and SelectStep now default to using Pop.last instead of Pop.mixed. All other Scoping steps already use Pop.last semantics. This change will allow us to introduce a simpler Path data structure for LABELED_PATH traversals with complete Pop.last semantics (99 percent of uses).
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user spmallette commented on the issue:

          https://github.com/apache/tinkerpop/pull/482

          It should be `selectV3d2()` given that it will yield backward compatibility to 3.2.x.

          Maybe I missed it, but are there any tests for `selectV3d2()` - at least something to exercise the old way of doing things?

          Show
          githubbot ASF GitHub Bot added a comment - Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/482 It should be `selectV3d2()` given that it will yield backward compatibility to 3.2.x. Maybe I missed it, but are there any tests for `selectV3d2()` - at least something to exercise the old way of doing things?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user okram commented on the issue:

          https://github.com/apache/tinkerpop/pull/482

          The only difference is `select(Pop.mix,...)` (that is `selectV3d2()`) and there are 6 or so tests that test it in `RangeTest` and `TailTest`. I can explicitly make them use the method `selectV3d2()`, but as you can see, its simply just a call to `select(Pop.mix,...)`. I could make a `SelectV3d2Test` and just copy those range tests over and call those methods explicitly, but seems pointless as unlike `groupV320()`, its not a new step with different semantics, its the same step with just a different argument.

          Show
          githubbot ASF GitHub Bot added a comment - Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/482 The only difference is `select(Pop.mix,...)` (that is `selectV3d2()`) and there are 6 or so tests that test it in `RangeTest` and `TailTest`. I can explicitly make them use the method `selectV3d2()`, but as you can see, its simply just a call to `select(Pop.mix,...)`. I could make a `SelectV3d2Test` and just copy those range tests over and call those methods explicitly, but seems pointless as unlike `groupV320()`, its not a new step with different semantics, its the same step with just a different argument.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user spmallette commented on the issue:

          https://github.com/apache/tinkerpop/pull/482

          ok - if it's trivial, then perhaps not worrying about.

          Show
          githubbot ASF GitHub Bot added a comment - Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/482 ok - if it's trivial, then perhaps not worrying about.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user okram commented on the issue:

          https://github.com/apache/tinkerpop/pull/482

          When I go to merge, I will change the method name to `selectV3d2()` so if you want to vote.... ... Also, I think we need @dkuppitz and @robertdale and @mhfrantz to have a look and vote too.

          Show
          githubbot ASF GitHub Bot added a comment - Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/482 When I go to merge, I will change the method name to `selectV3d2()` so if you want to vote.... ... Also, I think we need @dkuppitz and @robertdale and @mhfrantz to have a look and vote too.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dkuppitz commented on the issue:

          https://github.com/apache/tinkerpop/pull/482

          I'm cool with that change. I still have to run a `docker/build.sh -t -i -n` though - will do so as soon as I have the resources available (unless someone else volunteers).

          Show
          githubbot ASF GitHub Bot added a comment - Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/482 I'm cool with that change. I still have to run a `docker/build.sh -t -i -n` though - will do so as soon as I have the resources available (unless someone else volunteers).
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user dkuppitz commented on the issue:

          https://github.com/apache/tinkerpop/pull/482

          `docker/build.sh -t -i -n` succeeded.

          VOTE: +1

          Show
          githubbot ASF GitHub Bot added a comment - Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/482 `docker/build.sh -t -i -n` succeeded. VOTE: +1
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user robertdale commented on the issue:

          https://github.com/apache/tinkerpop/pull/482

          VOTE: +1

          Show
          githubbot ASF GitHub Bot added a comment - Github user robertdale commented on the issue: https://github.com/apache/tinkerpop/pull/482 VOTE: +1
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user okram closed the pull request at:

          https://github.com/apache/tinkerpop/pull/482

          Show
          githubbot ASF GitHub Bot added a comment - Github user okram closed the pull request at: https://github.com/apache/tinkerpop/pull/482

            People

            • Assignee:
              okram Marko A. Rodriguez
              Reporter:
              okram Marko A. Rodriguez
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development