FLINK-2361

CompactingHashTable loses entries

    Details

      Description

      When running the simple Connected Components algorithm (currently in Gelly) on the Twitter follower graph with 1, 100, or 10000 iterations, I get the following error:

      Caused by: java.lang.Exception: Target vertex '657282846' does not exist!.
      at org.apache.flink.graph.spargel.VertexCentricIteration$VertexUpdateUdfSimpleVV.coGroup(VertexCentricIteration.java:300)
      at org.apache.flink.runtime.operators.CoGroupWithSolutionSetSecondDriver.run(CoGroupWithSolutionSetSecondDriver.java:220)
      at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
      at org.apache.flink.runtime.iterative.task.AbstractIterativePactTask.run(AbstractIterativePactTask.java:139)
      at org.apache.flink.runtime.iterative.task.IterationTailPactTask.run(IterationTailPactTask.java:107)
      at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
      at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
      at java.lang.Thread.run(Thread.java:722)

      This is very bizarre, because the DataSet of vertices is produced from the DataSet of edges, which means there cannot be an edge with an invalid target id. The method calls flatMap to isolate the src and trg ids and distinct to ensure their uniqueness.
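
      To illustrate why a missing target vertex should be impossible, here is a minimal sketch of the same idea using plain Java streams instead of the Flink DataSet API. The class name, the helper names `vertexIds` and `allTargetsExist`, and the sample edges are assumptions made for illustration; this is not the Gelly code. It only shows that a vertex set built with flatMap and distinct over the edges necessarily contains every target id.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class VertexSetFromEdges {

    // Hypothetical helper: each edge is a {src, trg} pair. flatMap emits both
    // ids of every edge, and distinct drops the duplicates.
    static Set<Long> vertexIds(List<long[]> edges) {
        return edges.stream()
                .flatMap(e -> Stream.of(e[0], e[1]))
                .distinct()
                .collect(Collectors.toSet());
    }

    // Hypothetical check: true when every edge target is in the vertex set.
    static boolean allTargetsExist(List<long[]> edges, Set<Long> vertices) {
        return edges.stream().allMatch(e -> vertices.contains(e[1]));
    }

    public static void main(String[] args) {
        // Made-up sample edges; only the last id is taken from the error message.
        List<long[]> edges = Arrays.asList(
                new long[]{1L, 2L},
                new long[]{2L, 3L},
                new long[]{3L, 657282846L});

        Set<Long> vertices = vertexIds(edges);
        System.out.println("vertices: " + vertices.size());
        System.out.println("all targets exist: " + allTargetsExist(edges, vertices));
    }
}
```

      Because the vertex set is derived from the very edges it is checked against, the check can only fail if an entry is lost somewhere after the set is built.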

      The algorithm works fine on smaller data sets.

              People

              • Assignee: Stephan Ewen (sewen)
                Reporter: Andra Lungu (andralungu)
              • Votes: 0
                Watchers: 6
