Hadoop Common / HADOOP-485

allow a different comparator for grouping keys in calls to reduce

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      Some algorithms require that the values passed to reduce be sorted in a particular order, but extending the key with the additional sort fields causes the values to be handled by separate calls to reduce. (The user must then collect the values until they detect a "real" key change, and only then process them.)
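
The workaround described above can be sketched as follows. This is a minimal illustration in plain Java, not Hadoop code: the class name `ManualGroupingSketch`, the `processed` list, and the convention that the first character of the extended key is the "real" key are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ManualGroupingSketch {
    // Hypothetical convention: the "real" key is the first character of the extended key.
    private String currentRealKey = null;
    private final List<String> buffered = new ArrayList<>();
    // Collects the results of processing each real-key group, for inspection.
    final List<String> processed = new ArrayList<>();

    // Called once per extended key (A1, A2, ...), as the framework does today.
    void reduce(String extendedKey, List<String> values) {
        String realKey = extendedKey.substring(0, 1);
        if (currentRealKey != null && !currentRealKey.equals(realKey)) {
            flush(); // a "real" key change was detected
        }
        currentRealKey = realKey;
        buffered.addAll(values);
    }

    // Must run after the last reduce call, or the final group is lost.
    void close() {
        if (currentRealKey != null) {
            flush();
        }
    }

    private void flush() {
        processed.add(currentRealKey + " -> " + buffered);
        buffered.clear();
        currentRealKey = null;
    }

    public static void main(String[] args) {
        ManualGroupingSketch r = new ManualGroupingSketch();
        r.reduce("A1", Arrays.asList("V1"));
        r.reduce("A2", Arrays.asList("V2"));
        r.reduce("B1", Arrays.asList("V4"));
        r.close();
        System.out.println(r.processed);
    }
}
```

The reducer has to carry state between calls and remember to flush on close, which is the bookkeeping this issue aims to remove.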

      It would be much easier if the framework let you define a second comparator that did the grouping of values for reduces. So if your reduce inputs look like:

      A1, V1
      A2, V2
      A3, V3
      B1, V4
      B2, V5

      instead of getting calls to reduce that look like:

      reduce(A1, {V1});
      reduce(A2, {V2});
      reduce(A3, {V3});
      reduce(B1, {V4});
      reduce(B2, {V5});

      you could define the grouping comparator to just compare the letters and end up with:

      reduce(A1, {V1,V2,V3});
      reduce(B1, {V4,V5});

      which is the desired outcome. Note that this assumes that the "extra" part of the key is just for sorting because the reduce will only see the first representative of each equivalence class.
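
The proposed behavior can be simulated outside the framework. The sketch below is plain Java and makes no claim about how the patch implements it: `GroupingComparatorSketch`, `simulateReduceCalls`, `SORT`, and `GROUP` are hypothetical names, and keys are modeled as strings such as "A1" whose first character is the grouping part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupingComparatorSketch {
    // Sort comparator: orders by the full key, so A1 < A2 < A3 < B1 < B2.
    static final Comparator<String> SORT = Comparator.naturalOrder();

    // Grouping comparator: compares only the leading letter, so A1 and A2 are "equal".
    static final Comparator<String> GROUP =
        (a, b) -> Character.compare(a.charAt(0), b.charAt(0));

    // Each input pair is {key, value}. Returns one formatted line per simulated reduce call.
    static List<String> simulateReduceCalls(List<String[]> pairs,
                                            Comparator<String> sort,
                                            Comparator<String> group) {
        List<String[]> sorted = new ArrayList<>(pairs);
        sorted.sort((a, b) -> sort.compare(a[0], b[0]));

        List<String> calls = new ArrayList<>();
        String groupKey = null; // first key seen in the current group
        List<String> values = new ArrayList<>();
        for (String[] pair : sorted) {
            if (groupKey != null && group.compare(groupKey, pair[0]) != 0) {
                calls.add(format(groupKey, values));
                values = new ArrayList<>();
                groupKey = null;
            }
            if (groupKey == null) {
                groupKey = pair[0]; // reduce only sees this first representative
            }
            values.add(pair[1]);
        }
        if (groupKey != null) {
            calls.add(format(groupKey, values));
        }
        return calls;
    }

    static String format(String key, List<String> values) {
        return "reduce(" + key + ", {" + String.join(",", values) + "})";
    }

    public static void main(String[] args) {
        List<String[]> input = Arrays.asList(
            new String[] {"B2", "V5"}, new String[] {"A1", "V1"},
            new String[] {"A3", "V3"}, new String[] {"B1", "V4"},
            new String[] {"A2", "V2"});
        // Prints reduce(A1, {V1,V2,V3}) and reduce(B1, {V4,V5})
        simulateReduceCalls(input, SORT, GROUP).forEach(System.out::println);
    }
}
```

Passing `SORT` as both comparators reproduces the current behavior of one reduce call per extended key. In Hadoop releases that carry this fix, the corresponding job setting is believed to be `JobConf.setOutputValueGroupingComparator`; the simulation itself does not depend on it.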

      Attachments

        1. 485.patch
          8 kB
          Tahir Hashmi
        2. 485.patch
          8 kB
          Tahir Hashmi
        3. 485.patch
          8 kB
          Tahir Hashmi
        4. 485.patch
          8 kB
          Tahir Hashmi
        5. 485.patch
          7 kB
          Tahir Hashmi
        6. Hadoop-485-pre.patch
          3 kB
          Tahir Hashmi
        7. TestUserValueGrouping.java.patch
          5 kB
          Tahir Hashmi

            People

              Assignee: Tahir Hashmi
              Reporter: Owen O'Malley