Giraph
  1. Giraph
  2. GIRAPH-124

Combiner should return Iterable<M> instead of M or null.

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.1.0
    • Fix Version/s: 0.1.0
    • Component/s: graph
    • Labels:
      None

      Description

      Currently VertexCombiner is expected to return a single message combining the input messages, or null in case no message should be sent. The new expected interface should return an Iterable<M>, possibly empty. The number of elements in the returned Iterable is supposed to be smaller than the number of input messages, by the initial definition of a Combiner (defined as a function to reduce I/O by combining multiple messages into 1).

      1. GIRAPH-124.diff
        12 kB
        Claudio Martella
      2. GIRAPH-124.diff
        11 kB
        Claudio Martella

        Activity

        Hide
        Claudio Martella added a comment -

        As discussed in mailing-list, there are two possible approaches on how to manage a returned Iterable<M> with more messages than the input set: a warning in the logs or an exception. We should take a decision about it.

        Show
        Claudio Martella added a comment - As discussed in mailing-list, there are two possible approaches on how to manage a returned Iterable<M> with more messages than the input set: a warning in the logs or an exception. We should take a decision about it.
        Hide
        Claudio Martella added a comment -

        Implements the Iterable<M> interface on the return value. Throwing IllegalStateException when returned number of elements is >= the number of original messages.

        Show
        Claudio Martella added a comment - Implements the Iterable<M> interface on the return value. Throwing IllegalStateException when returned number of elements is >= the number of original messages.
        Hide
        Avery Ching added a comment -

        Nice, Claudio. I haven't had a chance to fully test it, but wanted to give you some early feedback.

        1) Some changes have messed up indenting a little (here are some examples)

        • public FloatWritable combine(LongWritable vertexIndex,
          + public Iterable<FloatWritable> combine(LongWritable vertexIndex,
          Iterable<FloatWritable> msgList)
        • public abstract M combine(I vertexIndex,
          + public abstract Iterable<M> combine(I vertexIndex,
          Iterable<M> messages) throws IOException;
        • M combinedMsg = combiner.combine(entry.getKey(),
          + Iterable<M> messages = combiner.combine(entry.getKey(),
          entry.getValue());
        • public IntWritable combine(LongWritable vertexIndex,
          + public Iterable<IntWritable> combine(LongWritable vertexIndex,
          Iterable<IntWritable> messages)

        2) Should we make the requirement that the returned result has a size < input size? I think the argument was that some classification of messages might not always reduce the number of messages? Perhaps <=?

        Show
        Avery Ching added a comment - Nice, Claudio. I haven't had a chance to fully test it, but wanted to give you some early feedback. 1) Some changes have messed up indenting a little (here are some examples) public FloatWritable combine(LongWritable vertexIndex, + public Iterable<FloatWritable> combine(LongWritable vertexIndex, Iterable<FloatWritable> msgList) public abstract M combine(I vertexIndex, + public abstract Iterable<M> combine(I vertexIndex, Iterable<M> messages) throws IOException; M combinedMsg = combiner.combine(entry.getKey(), + Iterable<M> messages = combiner.combine(entry.getKey(), entry.getValue()); public IntWritable combine(LongWritable vertexIndex, + public Iterable<IntWritable> combine(LongWritable vertexIndex, Iterable<IntWritable> messages) 2) Should we make the requirement that the returned result has a size < input size? I think the argument was that some classification of messages might not always reduce the number of messages? Perhaps <=?
        Hide
        Claudio Martella added a comment -

        1) Weird... I'll take a look at it.
        2) To be honest I didn't remember what we had agreed exactly, but <= sounds reasonable.

        Show
        Claudio Martella added a comment - 1) Weird... I'll take a look at it. 2) To be honest I didn't remember what we had agreed exactly, but <= sounds reasonable.
        Hide
        Claudio Martella added a comment -

        Fixes indentation and Exception messages according to Avery's comments.

        Show
        Claudio Martella added a comment - Fixes indentation and Exception messages according to Avery's comments.
        Hide
        Avery Ching added a comment -

        Thanks for the improvements, I have a few more comments. These comments are a little tricky on JIRA, do you want to try reviewboard next time?

         
        +   /**
        +    * Combines message values for a particular vertex index.
        +    *
        +    * @param vertexIndex Index of the vertex getting these messages
        +    * @param messages Iterable of the messages to be combined
        +    * @return Message that is combined from {@link messages} or null if no
        +    *         message it to be sent
        +    * @throws IOException
        +    */
        +    public abstract Iterable<M> combine(I vertexIndex,
        +            Iterable<M> messages) throws IOException;
        

        This javadoc is not quite right. @return should be corrected, for instance, null is not legal now right?

        +                        Iterable<M> messages = combiner.combine(entry.getKey(),
                                                                  entry.getValue());
        

        Can you align 'entry.getValue()'?

        +                    if (Iterables.size(outMessageList) < 
        +                            Iterables.size(messages)) {
        

        Before you do this check, we still need to do a check that messages is not null. If it is null, the combiner has violated the combine() contract. We need to throw an exception at this point.

        Also, have you run the unittests (local and MR)?

        Thanks!

        Show
        Avery Ching added a comment - Thanks for the improvements, I have a few more comments. These comments are a little tricky on JIRA, do you want to try reviewboard next time? + /** + * Combines message values for a particular vertex index. + * + * @param vertexIndex Index of the vertex getting these messages + * @param messages Iterable of the messages to be combined + * @return Message that is combined from {@link messages} or null if no + * message it to be sent + * @throws IOException + */ + public abstract Iterable<M> combine(I vertexIndex, + Iterable<M> messages) throws IOException; This javadoc is not quite right. @return should be corrected, for instance, null is not legal now right? + Iterable<M> messages = combiner.combine(entry.getKey(), entry.getValue()); Can you align 'entry.getValue()'? + if (Iterables.size(outMessageList) < + Iterables.size(messages)) { Before you do this check, we still need to do a check that messages is not null. If it is null, the combiner has violated the combine() contract. We need to throw an exception at this point. Also, have you run the unittests (local and MR)? Thanks!
        Hide
        Claudio Martella added a comment -

        Ok, fixed the javadoc and the null check. Yes, both unittests were run.

        See https://reviews.apache.org/r/3592/.

        Show
        Claudio Martella added a comment - Ok, fixed the javadoc and the null check. Yes, both unittests were run. See https://reviews.apache.org/r/3592/ .
        Hide
        Avery Ching added a comment -

        Claudio, the reviewboard link has no code unfortunately...

        Show
        Avery Ching added a comment - Claudio, the reviewboard link has no code unfortunately...
        Hide
        Claudio Martella added a comment -

        damn RB. it should be fixed now...

        Show
        Claudio Martella added a comment - damn RB. it should be fixed now...
        Hide
        Claudio Martella added a comment -

        Commited, thanks.

        Show
        Claudio Martella added a comment - Commited, thanks.
        Hide
        Hudson added a comment -

        Integrated in Giraph-trunk-Commit #65 (See https://builds.apache.org/job/Giraph-trunk-Commit/65/)
        GIRAPH-124: Combiner should return Iterable<M> instead of M or null.

        claudio : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1234997
        Files :

        • /incubator/giraph/trunk/CHANGELOG
        • /incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
        • /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java
        • /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java
        • /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/VertexCombiner.java
        • /incubator/giraph/trunk/src/test/java/org/apache/giraph/TestVertexTypes.java
        • /incubator/giraph/trunk/src/test/java/org/apache/giraph/examples/MinimumIntCombinerTest.java
        Show
        Hudson added a comment - Integrated in Giraph-trunk-Commit #65 (See https://builds.apache.org/job/Giraph-trunk-Commit/65/ ) GIRAPH-124 : Combiner should return Iterable<M> instead of M or null. claudio : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1234997 Files : /incubator/giraph/trunk/CHANGELOG /incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java /incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/VertexCombiner.java /incubator/giraph/trunk/src/test/java/org/apache/giraph/TestVertexTypes.java /incubator/giraph/trunk/src/test/java/org/apache/giraph/examples/MinimumIntCombinerTest.java
        Hide
        Claudio Martella added a comment -

        Apparently I forgot to add EmptyIterable.java. Just fixed it. Resolving.

        Show
        Claudio Martella added a comment - Apparently I forgot to add EmptyIterable.java. Just fixed it. Resolving.

          People

          • Assignee:
            Unassigned
            Reporter:
            Claudio Martella
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development