Flink / FLINK-11595

Gelly's addEdge() in certain circumstances still includes duplicate vertices.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Not a Priority
    • Resolution: Won't Fix
    • Affects Version/s: 1.7.1
    • None
    • None
    • Environment: macOS, IntelliJ

    Description

      Assuming a base graph constructed by:

      ```

      import org.apache.flink.api.common.functions.MapFunction;
      import org.apache.flink.api.common.typeinfo.TypeHint;
      import org.apache.flink.api.java.DataSet;
      import org.apache.flink.api.java.ExecutionEnvironment;
      import org.apache.flink.api.java.tuple.Tuple;
      import org.apache.flink.api.java.tuple.Tuple6;
      import org.apache.flink.graph.Edge;
      import org.apache.flink.graph.Graph;
      import org.apache.flink.graph.Vertex;

      // VertexLabel, EdgeLabel and Util are the reporter's own classes (not shown in this report).
      public class GraphCorn {

          public static Graph<String, VertexLabel, EdgeLabel> gc;

          public GraphCorn(String filename) throws Exception {
              ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

              // Each CSV row: srcID, dstID, srcLabel, dstLabel, edgeLabel, timestamp
              DataSet<Tuple6<String, String, String, String, String, String>> csvInput = env.readCsvFile(filename)
                      .types(String.class, String.class, String.class, String.class, String.class, String.class);

              // Source vertices: (srcID, srcLabel)
              DataSet<Vertex<String, VertexLabel>> srcTuples = csvInput.project(0, 2)
                      .map(new MapFunction<Tuple, Vertex<String, VertexLabel>>() {
                          @Override
                          public Vertex<String, VertexLabel> map(Tuple tuple) throws Exception {
                              VertexLabel lb = new VertexLabel(Util.hash(tuple.getField(1)));
                              return new Vertex<>(tuple.getField(0), lb);
                          }
                      }).returns(new TypeHint<Vertex<String, VertexLabel>>(){});

              // Destination vertices: (dstID, dstLabel)
              DataSet<Vertex<String, VertexLabel>> dstTuples = csvInput.project(1, 3)
                      .map(new MapFunction<Tuple, Vertex<String, VertexLabel>>() {
                          @Override
                          public Vertex<String, VertexLabel> map(Tuple tuple) throws Exception {
                              VertexLabel lb = new VertexLabel(Util.hash(tuple.getField(1)));
                              return new Vertex<>(tuple.getField(0), lb);
                          }
                      }).returns(new TypeHint<Vertex<String, VertexLabel>>(){});

              // Deduplicate vertices by ID (field 0)
              DataSet<Vertex<String, VertexLabel>> vertexTuples = srcTuples.union(dstTuples).distinct(0);

              // Edges: (srcID, dstID, EdgeLabel(edgeLabel, timestamp))
              DataSet<Edge<String, EdgeLabel>> edgeTuples = csvInput.project(0, 1, 4, 5)
                      .map(new MapFunction<Tuple, Edge<String, EdgeLabel>>() {
                          @Override
                          public Edge<String, EdgeLabel> map(Tuple tuple) throws Exception {
                              EdgeLabel lb = new EdgeLabel(Util.hash(tuple.getField(2)), Long.parseLong(tuple.getField(3)));
                              return new Edge<>(tuple.getField(0), tuple.getField(1), lb);
                          }
                      }).returns(new TypeHint<Edge<String, EdgeLabel>>(){});

              // gc is a static field, so assign it without "this."
              gc = Graph.fromDataSet(vertexTuples, edgeTuples, env);
          }
      }

      ```

      Base graph CSV:

      ```

      0,1,a,b,c,0
      0,2,a,d,e,1
      1,2,b,d,f,2

      ```

      Edges are then added to the base graph using the following code:

      ```

      // Requires java.io.BufferedReader and java.io.FileReader.
      try (BufferedReader br = new BufferedReader(new FileReader(this.fileName))) {
          for (String line; (line = br.readLine()) != null; ) {
              // Each row: srcID, dstID, srcLabel, dstLabel, edgeLabel, timestamp
              String[] attributes = line.split(",");
              assert (attributes.length == 6);
              String srcID = attributes[0];
              String dstID = attributes[1];
              String srcLb = attributes[2];
              String dstLb = attributes[3];
              String edgeLb = attributes[4];
              String ts = attributes[5];

              Vertex<String, VertexLabel> src = new Vertex<>(srcID, new VertexLabel(Util.hash(srcLb)));
              Vertex<String, VertexLabel> dst = new Vertex<>(dstID, new VertexLabel(Util.hash(dstLb)));
              EdgeLabel edge = new EdgeLabel(Util.hash(edgeLb), Long.parseLong(ts));

              // addEdge() returns a new graph, so the static reference is reassigned on every row
              GraphCorn.gc = GraphCorn.gc.addEdge(src, dst, edge);
          }
      } catch (Exception e) {
          System.err.println(e.getMessage());
      }

      ```

      The edges to add are:

      ```

      0,4,a,d,k,3
      1,3,b,a,g,3
      2,3,d,a,h,4

      ```

      After these calls, GraphCorn.gc contains duplicates of vertices 0, 1, and 2 (the vertices that already exist in the base graph), which should not be the case according to the documentation.
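
      To make the expected result concrete, here is a minimal plain-Java sketch that does not use Flink or Gelly at all. It only models the vertex sets from the two CSV files above. The names `VertexDedupSketch`, `SimpleVertex`, `endpoints`, and `distinctById` are hypothetical and exist only for this illustration; the sketch does not claim to reproduce how Gelly's addEdge() works internally.

      ```java
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.LinkedHashMap;
      import java.util.List;
      import java.util.Map;

      public class VertexDedupSketch {

          /** Hypothetical stand-in for a vertex (ID plus label); not Gelly's Vertex class. */
          static final class SimpleVertex {
              final String id;
              final String label;

              SimpleVertex(String id, String label) {
                  this.id = id;
                  this.label = label;
              }
          }

          /** Turns "src,dst,srcLabel,dstLabel,edgeLabel,ts" rows into their two endpoint vertices. */
          static List<SimpleVertex> endpoints(List<String> csvRows) {
              List<SimpleVertex> result = new ArrayList<>();
              for (String row : csvRows) {
                  String[] f = row.split(",");
                  result.add(new SimpleVertex(f[0], f[2]));
                  result.add(new SimpleVertex(f[1], f[3]));
              }
              return result;
          }

          /** Keeps the first vertex seen for each ID, preserving encounter order. */
          static List<SimpleVertex> distinctById(List<SimpleVertex> vertices) {
              Map<String, SimpleVertex> byId = new LinkedHashMap<>();
              for (SimpleVertex v : vertices) {
                  byId.putIfAbsent(v.id, v);
              }
              return new ArrayList<>(byId.values());
          }

          public static void main(String[] args) {
              List<String> base = Arrays.asList("0,1,a,b,c,0", "0,2,a,d,e,1", "1,2,b,d,f,2");
              List<String> added = Arrays.asList("0,4,a,d,k,3", "1,3,b,a,g,3", "2,3,d,a,h,4");

              // Base graph vertices (already distinct by ID) plus every endpoint of the added edges.
              List<SimpleVertex> all = new ArrayList<>(distinctById(endpoints(base)));
              all.addAll(endpoints(added));

              System.out.println("without dedup: " + all.size());               // prints "without dedup: 9"
              System.out.println("dedup by ID:   " + distinctById(all).size()); // prints "dedup by ID:   5"
          }
      }
      ```

      With deduplication by ID, the combined graph has five vertices (0, 1, 2, 3, 4). Any count above five means that some vertex IDs appear more than once, which is the situation described in this report.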

       

       

       


          People

            Assignee: Unassigned
            Reporter: Calvin Han (chan)
            Votes: 0
            Watchers: 3


              Time Tracking

                Original Estimate: 2h
                Remaining Estimate: 2h
                Time Spent: Not Specified