Hadoop Map/Reduce / MAPREDUCE-6837

Add an equivalent to Crunch's Pair class


Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: mrv2

    Description

      Crunch has this great Pair class (https://crunch.apache.org/apidocs/0.14.0/org/apache/crunch/Pair.html) that saves one from constantly implementing composite writables. It seems silly that we still don't have an equivalent in MR.

      I would like to see a new class with the following API:

      package org.apache.hadoop.io;
      
      public class CompositeWritable<P extends WritableComparable, S extends WritableComparable> implements WritableComparable<CompositeWritable> {
        public CompositeWritable(P primary, S secondary);
        public P getPrimary();
        public void setPrimary(P primary);
        public S getSecondary();
        public void setSecondary(S secondary);
      
        // Return true if both primaries and both secondaries are equal
        public boolean equals(Object o);
      
        // Return the primary's hash code
        public int hashCode();
      
        // Sort first by primary and then by secondary
        public int compareTo(CompositeWritable o);
      
        public void readFields(DataInput in) throws IOException;
        public void write(DataOutput out) throws IOException;
      }
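
      To make the proposal concrete, here is a minimal sketch of how the class body might look. It is not the attached patch. So that it compiles without Hadoop on the classpath, the sketch declares its own stand-in WritableComparable interface with the same two serialization methods as org.apache.hadoop.io.WritableComparable; IntField and CompositeWritableDemo are hypothetical helpers that exist only to exercise the class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for org.apache.hadoop.io.WritableComparable (assumption: lets the sketch build without Hadoop).
interface WritableComparable<T> extends Comparable<T> {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical int-valued field, used only to exercise the composite below.
final class IntField implements WritableComparable<IntField> {
  private int value;
  IntField(int value) { this.value = value; }
  @Override public void write(DataOutput out) throws IOException { out.writeInt(value); }
  @Override public void readFields(DataInput in) throws IOException { value = in.readInt(); }
  @Override public int compareTo(IntField o) { return Integer.compare(value, o.value); }
  @Override public boolean equals(Object o) {
    return o instanceof IntField && ((IntField) o).value == value;
  }
  @Override public int hashCode() { return Integer.hashCode(value); }
}

class CompositeWritable<P extends WritableComparable<P>, S extends WritableComparable<S>>
    implements WritableComparable<CompositeWritable<P, S>> {
  private P primary;
  private S secondary;

  CompositeWritable(P primary, S secondary) {
    this.primary = primary;
    this.secondary = secondary;
  }

  P getPrimary() { return primary; }
  void setPrimary(P primary) { this.primary = primary; }
  S getSecondary() { return secondary; }
  void setSecondary(S secondary) { this.secondary = secondary; }

  // True only if both primaries and both secondaries are equal.
  @Override public boolean equals(Object o) {
    if (!(o instanceof CompositeWritable)) return false;
    CompositeWritable<?, ?> other = (CompositeWritable<?, ?>) o;
    return primary.equals(other.primary) && secondary.equals(other.secondary);
  }

  // Only the primary contributes, so keys that share a primary hash alike.
  @Override public int hashCode() { return primary.hashCode(); }

  // Sort first by primary, then by secondary.
  @Override public int compareTo(CompositeWritable<P, S> o) {
    int byPrimary = primary.compareTo(o.primary);
    return byPrimary != 0 ? byPrimary : secondary.compareTo(o.secondary);
  }

  // The two fields are serialized back to back.
  @Override public void write(DataOutput out) throws IOException {
    primary.write(out);
    secondary.write(out);
  }

  // Reads into the existing field instances, so both must be non-null.
  @Override public void readFields(DataInput in) throws IOException {
    primary.readFields(in);
    secondary.readFields(in);
  }
}

final class CompositeWritableDemo {
  // Serializes `from` into a byte array and reads it back into `to`.
  static <W extends WritableComparable<W>> void copy(W from, W to) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      from.write(new DataOutputStream(bytes));
      to.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

      Hashing on the primary alone means a hash-based partitioner sends every key with the same primary to the same reducer, which is what a secondary sort needs. One open design question the sketch sidesteps: Hadoop normally instantiates keys through a no-argument constructor, so a real implementation would need some way to create the two field instances before readFields runs.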
      

      With such a class, implementing a secondary sort would mean just implementing a custom grouping comparator. That comparator could also be implemented as part of this JIRA:

      package org.apache.hadoop.io;
      
      public class CompositeGroupingComparator extends WritableComparator {
        ...
      }
      

      Or some such.
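
      As a rough illustration of why a grouping comparator is the only extra piece a secondary sort needs, the sketch below compares keys by primary only and then mimics the shuffle in plain Java. It makes several assumptions: CompositeKey is a simplified stand-in for the proposed CompositeWritable with serialization omitted, java.util.Comparator stands in for WritableComparator, and SecondarySortDemo.group is a hypothetical helper. In a real job the comparator would extend WritableComparator and be registered with Job.setGroupingComparatorClass.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Simplified stand-in for the proposed CompositeWritable (serialization omitted).
final class CompositeKey<P extends Comparable<P>, S extends Comparable<S>>
    implements Comparable<CompositeKey<P, S>> {
  final P primary;
  final S secondary;

  CompositeKey(P primary, S secondary) {
    this.primary = primary;
    this.secondary = secondary;
  }

  // Sort order: primary first, then secondary.
  @Override public int compareTo(CompositeKey<P, S> o) {
    int byPrimary = primary.compareTo(o.primary);
    return byPrimary != 0 ? byPrimary : secondary.compareTo(o.secondary);
  }
}

// Grouping order: keys with equal primaries belong to the same reduce group.
final class CompositeGroupingComparator<P extends Comparable<P>, S extends Comparable<S>>
    implements Comparator<CompositeKey<P, S>> {
  @Override public int compare(CompositeKey<P, S> a, CompositeKey<P, S> b) {
    return a.primary.compareTo(b.primary);
  }
}

final class SecondarySortDemo {
  // Sorts by the full key, then starts a new group whenever the grouping
  // comparator reports that the primary changed.
  static <P extends Comparable<P>, S extends Comparable<S>> List<List<S>> group(
      List<CompositeKey<P, S>> keys) {
    List<CompositeKey<P, S>> sorted = new ArrayList<>(keys);
    Collections.sort(sorted);
    Comparator<CompositeKey<P, S>> grouping = new CompositeGroupingComparator<>();
    List<List<S>> groups = new ArrayList<>();
    CompositeKey<P, S> previous = null;
    for (CompositeKey<P, S> key : sorted) {
      if (previous == null || grouping.compare(previous, key) != 0) {
        groups.add(new ArrayList<>());
      }
      groups.get(groups.size() - 1).add(key.secondary);
      previous = key;
    }
    return groups;
  }
}
```

      Each inner list corresponds to one reduce call, and its elements arrive already ordered by the secondary because the sort used the full key while the grouping ignored the secondary.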

      Crunch also provides Tuple3, Tuple4, and TupleN classes, but I don't think we need to add equivalents. If someone really wants that capability, they can nest composite keys.
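
      For instance, a three-part key can be expressed as a pair whose primary is itself a pair. The sketch below is only illustrative: Composite is a simplified stand-in for the proposed class with serialization omitted, and NestedKeyDemo.key with its user, year, and month fields is a hypothetical example.

```java
// Simplified stand-in for the proposed CompositeWritable (serialization omitted).
final class Composite<P extends Comparable<P>, S extends Comparable<S>>
    implements Comparable<Composite<P, S>> {
  final P primary;
  final S secondary;

  Composite(P primary, S secondary) {
    this.primary = primary;
    this.secondary = secondary;
  }

  // Sort first by primary, then by secondary.
  @Override public int compareTo(Composite<P, S> o) {
    int byPrimary = primary.compareTo(o.primary);
    return byPrimary != 0 ? byPrimary : secondary.compareTo(o.secondary);
  }
}

final class NestedKeyDemo {
  // Hypothetical three-part key: ((user, year), month).
  static Composite<Composite<String, Integer>, Integer> key(String user, int year, int month) {
    return new Composite<>(new Composite<>(user, year), month);
  }
}
```

      Because the outer compareTo delegates to the inner pair first, nested keys sort by user, then year, then month, which is the ordering a dedicated three-element tuple would give.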

      Don't forget to add unit tests!

      Attachments

        1. MAPREDUCE-6837.001.patch
          9 kB
          Gézapeti
        2. MAPREDUCE-6837.002.patch
          9 kB
          Gézapeti

          People

            gezapeti Gézapeti
            templedf Daniel Templeton
            Votes: 0
            Watchers: 3
