HAMA-598

Destination of Edge should be set according to partitionID with or without runtime partitioning option.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.6.0
    • Component/s: graph
    • Labels:
      None
    • Environment:

      Change Fix Version/s to 0.6.

      Description

      Otherwise, you'll see NullPointerExceptions.

      1. HAMA-598.patch
        7 kB
        Edward J. Yoon

        Activity

        Edward J. Yoon added a comment -

        This patch fixes the problem.

        Edward J. Yoon added a comment -

        Oh... I'll attach my new patch.

        Edward J. Yoon added a comment -

        This is a quick commit for testing; a clean solution to this problem requires a big refactoring.

        Index: graph/src/main/java/org/apache/hama/graph/GraphJobRunner.java
        ===================================================================
        --- graph/src/main/java/org/apache/hama/graph/GraphJobRunner.java	(revision 1354305)
        +++ graph/src/main/java/org/apache/hama/graph/GraphJobRunner.java	(working copy)
        @@ -391,16 +391,16 @@
             Vertex<V, E, M> vertex = newVertexInstance(vertexClass, conf);
             vertex.setPeer(peer);
             vertex.runner = this;
        -    while (true) {
        -      KeyValuePair<Writable, Writable> next = peer.readNext();
        -      if (next == null) {
        -        break;
        -      }
        +    
        +    KeyValuePair<Writable, Writable> next = null;
        +    int lines = 0;
        +    while ((next = peer.readNext()) != null) {
               boolean vertexFinished = reader.parseVertex(next.getKey(),
                   next.getValue(), vertex);
               if (!vertexFinished) {
                 continue;
               }
        +      
               if (vertex.getEdges() == null) {
                 vertex.setEdges(new ArrayList<Edge<V, E>>(0));
               }
        @@ -420,12 +420,26 @@
                 }
                 peer.send(peer.getPeerName(partition), new GraphJobMessage(vertex));
               } else {
        +        // FIXME need to set destination names
                 vertex.setup(conf);
                 vertices.put(vertex.getVertexID(), vertex);
               }
               vertex = newVertexInstance(vertexClass, conf);
               vertex.setPeer(peer);
               vertex.runner = this;
        +      
        +      lines++;
        +      if((lines % 50000) == 0) {
        +        peer.sync();
        +        GraphJobMessage msg = null;
        +        while ((msg = peer.getCurrentMessage()) != null) {
        +          Vertex<V, E, M> messagedVertex = (Vertex<V, E, M>) msg.getVertex();
        +          messagedVertex.setPeer(peer);
        +          messagedVertex.runner = this;
        +          messagedVertex.setup(conf);
        +          vertices.put(messagedVertex.getVertexID(), messagedVertex);
        +        }
        +      }
             }
         
             if (runtimePartitioning) {
        @@ -440,6 +454,8 @@
               }
             }
         
        +    LOG.info("Loading finished at " + peer.getSuperstepCount() + " steps.");
        +    
             /*
              * If the user want to repair the graph, it should traverse through that
              * local chunk of adjancency list and message the corresponding peer to
        
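
The periodic flush that the patch adds to the loading loop (every 50000 lines) can be illustrated in isolation. The following is a minimal sketch in plain Java, not Hama code: `BatchedLoadSketch`, `BatchSink`, and `load` are hypothetical names, and the sink only stands in for the `peer.sync()` barrier followed by draining the received messages. It assumes a single process and does not model multiple peers.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedLoadSketch {

  /** Hypothetical sink; one call stands in for a sync barrier plus message drain. */
  public interface BatchSink<T> {
    void flush(List<T> batch);
  }

  /**
   * Buffers records and flushes after every batchSize records, then once more
   * for any remainder. Returns the number of flushes performed.
   */
  public static <T> int load(Iterable<T> records, int batchSize, BatchSink<T> sink) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<T> buffer = new ArrayList<>();
    int flushes = 0;
    int lines = 0;
    for (T record : records) {
      buffer.add(record);
      lines++;
      // Same trigger as in the patch: flush whenever the counter hits a multiple.
      if (lines % batchSize == 0) {
        sink.flush(new ArrayList<>(buffer));
        buffer.clear();
        flushes++;
      }
    }
    // Records left over after the last full batch still need to be delivered.
    if (!buffer.isEmpty()) {
      sink.flush(new ArrayList<>(buffer));
      flushes++;
    }
    return flushes;
  }

  public static void main(String[] args) {
    List<Integer> records = new ArrayList<>();
    for (int i = 0; i < 7; i++) {
      records.add(i);
    }
    int flushes = load(records, 3, batch -> System.out.println("flushed " + batch));
    System.out.println("flushes: " + flushes);
  }
}
```

In a BSP setting every peer is generally expected to reach the same barriers, so the number of flushes would need to match across peers even when their input sizes differ. This sketch does not address that.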
        Edward J. Yoon added a comment -

        The graph example now works fine on a huge graph, as I expected.

        Thomas Jungblut added a comment -

        With regard to memory usage, do you think we need to store the destination peer address on every vertex?
        This matters especially for fault tolerance, where these values can change easily.

        Actually it would be better to calculate them on the fly based on the given partitioner.
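
To make the suggestion concrete, here is a minimal sketch of resolving the destination peer at send time from the target vertex ID, rather than caching a peer address on each vertex or edge. It is plain Java and does not use Hama's API: `SimplePartitioner`, `HashPartitionerSketch`, `destinationPeer`, and the peer names are all hypothetical, and hash partitioning is only an assumed example of "the given partitioner".

```java
import java.util.Arrays;
import java.util.List;

public class OnTheFlyDestination {

  /** Hypothetical stand-in for a partitioner: maps a vertex ID to a partition index. */
  public interface SimplePartitioner<V> {
    int getPartition(V vertexId, int numPartitions);
  }

  /** Hash-based partitioning; floorMod keeps the index non-negative. */
  public static final class HashPartitionerSketch<V> implements SimplePartitioner<V> {
    @Override
    public int getPartition(V vertexId, int numPartitions) {
      return Math.floorMod(vertexId.hashCode(), numPartitions);
    }
  }

  /**
   * Resolves the destination peer when a message is sent. Nothing is stored
   * per vertex, so a changed peer list takes effect on the next lookup.
   */
  public static <V> String destinationPeer(
      V targetVertexId, SimplePartitioner<V> partitioner, List<String> peerNames) {
    int partition = partitioner.getPartition(targetVertexId, peerNames.size());
    return peerNames.get(partition);
  }

  public static void main(String[] args) {
    // Hypothetical peer names, used only for illustration.
    List<String> peers = Arrays.asList("peer-0", "peer-1", "peer-2");
    SimplePartitioner<String> partitioner = new HashPartitionerSketch<>();
    for (String id : Arrays.asList("a", "b", "c", "d")) {
      System.out.println(id + " -> " + destinationPeer(id, partitioner, peers));
    }
  }
}
```

The trade-off is a small amount of computation per message in exchange for not holding a peer address on every vertex, and the mapping only stays consistent if every peer uses the same partitioner and the same ordering of peer names.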

        Edward J. Yoon added a comment -

        "Actually it would be better to calculate them on the fly based on the given partitioner."

        Agree with you!

        Thomas Jungblut added a comment -

        Will be fixed in HAMA-596


          People

          • Assignee:
            Edward J. Yoon
            Reporter:
            Edward J. Yoon
          • Votes:
            0
            Watchers:
            2
