TinkerPop / TINKERPOP-1585

OLAP dedup over non elements

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.3
    • Fix Version/s: 3.2.4
    • Component/s: hadoop, process
    • Labels:
      None

      Description

      OLAP dedup() is highly inefficient when it is fed non-elements, that is, plain values such as ids rather than vertices or edges.

      In a customer project, a query similar to the following returned a result in slightly more than 6 seconds:

      persistedRDD.
        V().hasLabel("label1","label2").
        inE("edgeLabel1","edgeLabel2").outV().
        id().count()
      

      The same query with dedup() added:

      persistedRDD.
        V().hasLabel("label1","label2").
        inE("edgeLabel1","edgeLabel2").outV().
        id().dedup().count()
      

      ...took more than 120 seconds.
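
      Note that the two queries also answer different questions: the first counts one id per matching edge, while the second counts distinct source-vertex ids. The following is a minimal Python sketch of that difference on an invented toy graph. It is not Gremlin, it says nothing about how TinkerPop or Spark execute the traversal, and the data and the helper name source_vertex_ids are hypothetical.

```python
# Toy stand-in for a graph. All data below is invented for illustration.
# Vertex id -> vertex label.
VERTEX_LABELS = {1: "label1", 2: "label2", 3: "other", 4: "other"}
# Edges as (out_vertex_id, edge_label, in_vertex_id).
EDGES = [
    (3, "edgeLabel1", 1),
    (3, "edgeLabel2", 2),
    (4, "edgeLabel1", 1),
    (4, "ignored", 2),
]


def source_vertex_ids(vertex_labels, edges, wanted_vertex_labels, wanted_edge_labels):
    """Roughly mimic V().hasLabel(...).inE(...).outV().id().

    Emits one id per matching incoming edge, so the same source vertex id
    can appear several times. Result order is not meaningful here.
    """
    ids = []
    for out_v, edge_label, in_v in edges:
        if vertex_labels[in_v] in wanted_vertex_labels and edge_label in wanted_edge_labels:
            ids.append(out_v)
    return ids


ids = source_vertex_ids(
    VERTEX_LABELS, EDGES, {"label1", "label2"}, {"edgeLabel1", "edgeLabel2"}
)
count_without_dedup = len(ids)    # analogous to id().count()
count_with_dedup = len(set(ids))  # analogous to id().dedup().count()
```

      In this toy data, vertex 3 is reached through two matching edges, so the deduplicated count is smaller than the plain count. The values being deduplicated are bare ids, which is the non-element case this issue describes.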

              People

              • Assignee: Marko A. Rodriguez (okram)
              • Reporter: Daniel Kuppitz (dkuppitz)
              • Votes: 2
              • Watchers: 5
