  1. TinkerPop
  2. TINKERPOP-1164

ReducingBarrierSteps should use ComputerMemory, not MapReduce.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: 3.1.0-incubating
    • Fix Version/s: 3.2.0-incubating
    • Component/s: process
    • Labels:
      None

      Description

      This just hit me like a ton of bricks. Check this:

      g.V().count()
      

      That is:

      TraversalVertexProgram + CountMapReduce
      

      That's stupid. Just use the "reduction" aspects of Memory. Replace CountMapReduce with:

      memory.incr("~reducing", traverser.bulk())
      

      That's it. Likewise for all the other reducing barriers! Now, not only do we not have to do a MapReduce job, we don't even have to break out of the TraversalVertexProgram (no more "No mid-barrier steps."). Why? Well, because it's in memory, it's computed on that iteration and then accessible!
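To make the idea concrete, here is a minimal sketch in plain Java. It is not the TinkerPop `Memory` API: `SketchMemory` and `CountViaMemory` are hypothetical names invented for illustration, and the only thing borrowed from the ticket is the `"~reducing"` key and the notion of adding each traverser's bulk to a shared total.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a vertex program's shared memory (NOT the TinkerPop Memory API).
final class SketchMemory {
    private final Map<String, Long> values = new ConcurrentHashMap<>();

    // Add delta to the running total stored under key, mirroring the idea of memory.incr(...).
    void incr(String key, long delta) {
        values.merge(key, delta, Long::sum);
    }

    // Read the current total; a key that was never incremented reads as 0.
    long get(String key) {
        return values.getOrDefault(key, 0L);
    }
}

final class CountViaMemory {
    // Each array element is the bulk of one traverser arriving at the count() step.
    static long count(long[] traverserBulks) {
        SketchMemory memory = new SketchMemory();
        for (long bulk : traverserBulks) {
            memory.incr("~reducing", bulk);
        }
        // The total is readable as soon as the loop (the "iteration") finishes.
        return memory.get("~reducing");
    }

    public static void main(String[] args) {
        System.out.println(count(new long[] {1, 1, 3, 1})); // prints 6
    }
}
```

The point of the sketch is that the reduction needs no separate job: the total lives in shared state and can be read in the same program that produced it.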

      g.V().group().by('lang').select('java').values("name")
      

      That would be one TraversalVertexProgram!
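A rough sketch of why a mid-traversal barrier fits in one program, again in plain Java rather than TinkerPop code. `MidTraversalGroup` and its nested `Vertex` class are hypothetical; the sketch only assumes that the grouped result sits in shared state so that the steps after the barrier can read it directly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MidTraversalGroup {
    // Hypothetical vertex carrying only the two properties the traversal touches.
    static final class Vertex {
        final String name;
        final String lang;

        Vertex(String name, String lang) {
            this.name = name;
            this.lang = lang;
        }
    }

    static List<String> namesForLang(List<Vertex> vertices, String lang) {
        // Barrier phase: group().by('lang') reduces into an in-memory map.
        Map<String, List<Vertex>> memory = new LinkedHashMap<>();
        for (Vertex v : vertices) {
            memory.computeIfAbsent(v.lang, k -> new ArrayList<>()).add(v);
        }
        // Post-barrier phase: select(lang).values("name") reads that map directly,
        // with no second job in between.
        List<String> names = new ArrayList<>();
        for (Vertex v : memory.getOrDefault(lang, Collections.<Vertex>emptyList())) {
            names.add(v.name);
        }
        return names;
    }
}
```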

      ...why do we even have MapReduce... is Memory all we really need? Crazy... that's craZy talk... but still, think about it.

          Activity

          okram Marko A. Rodriguez added a comment -

          I just tested this and it works. It was literally 5 lines of code I needed to change. It's "hardcoded" to only work for CountGlobalStep, but can easily be generalized to all ReducingBarrierSteps. This is insane. This will greatly speed up reducing steps and will allow for reducing steps to be mid-traversal. Here are the steps that benefit from this:

          sum()
          max()
          min()
          count()
          mean()
          group()
          groupCount()
          tree()
          fold()
          

          What is crazy is that besides aggregate() and store(), those are the only steps in Gremlin that have MapReduce implementations... this might be the argument for the death of MapReduce (in the future). For now, we can simply change these steps to use Memory and it's all backwards compatible and no one is the wiser.
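One way to picture the generalization from count() to the other reducing steps: each one folds contributions into a single value with a binary operator. The sketch below is an assumption about how that could look, not the change made for this ticket. `ReducingSlot` and `ReducerDemo` are hypothetical names, and only sum, max, min and count are shown (mean() would additionally need a sum/count pair, and group(), tree() and fold() would need collection-merging reducers).

```java
import java.util.function.BinaryOperator;

// Hypothetical memory slot: holds one value and folds each new contribution in with a reducer.
final class ReducingSlot<T> {
    private T value;
    private final BinaryOperator<T> reducer;

    ReducingSlot(T seed, BinaryOperator<T> reducer) {
        this.value = seed;
        this.reducer = reducer;
    }

    // Combine the stored value with a new contribution.
    synchronized void add(T contribution) {
        value = reducer.apply(value, contribution);
    }

    synchronized T get() {
        return value;
    }
}

final class ReducerDemo {
    // Returns {sum, max, min, count} for the given values, one slot per reduction.
    static long[] sumMaxMinCount(long[] xs) {
        ReducingSlot<Long> sum = new ReducingSlot<>(0L, Long::sum);
        ReducingSlot<Long> max = new ReducingSlot<>(Long.MIN_VALUE, Long::max);
        ReducingSlot<Long> min = new ReducingSlot<>(Long.MAX_VALUE, Long::min);
        ReducingSlot<Long> count = new ReducingSlot<>(0L, Long::sum);
        for (long x : xs) {
            sum.add(x);
            max.add(x);
            min.add(x);
            count.add(1L);
        }
        return new long[] {sum.get(), max.get(), min.get(), count.get()};
    }
}
```

Because every reducer here is associative and commutative, the order in which workers contribute does not change the result, which is what makes a shared memory slot a plausible home for these steps.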


            People

            • Assignee:
              okram Marko A. Rodriguez
            • Reporter:
              okram Marko A. Rodriguez
            • Votes:
              0
            • Watchers:
              1
