  1. Hadoop Common
  2. HADOOP-686

job.setOutputValueComparatorClass(theClass) should be supported

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • None
    • 0.13.0
    • None
    • None
    • all environment

    Description

      If the input to the reduce phase is:

      K2, V3
      K2, V2
      K1, V5
      K1, V3
      K1, V4

      then in current Hadoop the reduce output could be:
      K1, (V5, V3, V4)
      K2, (V3, V2)

      I would like Hadoop to support job.setOutputValueComparatorClass(theClass), so that the values for each key are delivered in sorted order and the output would be:
      K1, (V3, V4, V5)
      K2, (V2, V3)

      I think this feature is very important. Without it, we have to sort the values ourselves and worry about the values for a key being too large to fit into memory, which makes the code hard to read. That is why I think this sorting should be done in the Hadoop framework.
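
      The requested ordering can be sketched in plain Java, without any Hadoop classes. This is only an illustration of the semantics and of the user-side workaround the description mentions: the class ValueSortSketch and the method groupAndSortValues are hypothetical names, not part of Hadoop, and the sketch holds every value in memory, which is exactly the limitation the reporter wants the framework to remove.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop API is used here.
class ValueSortSketch {

    // Groups (key, value) pairs by key, then sorts each key's values with the
    // supplied comparator. All values are buffered in memory, so this approach
    // fails once the values for a single key no longer fit in the heap.
    static Map<String, List<String>> groupAndSortValues(
            List<String[]> pairs, Comparator<String> valueComparator) {
        Map<String, List<String>> grouped = new TreeMap<>(); // keys in sorted order
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        for (List<String> values : grouped.values()) {
            values.sort(valueComparator);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The reduce input from the description above.
        List<String[]> reduceInput = Arrays.asList(
                new String[] {"K2", "V3"},
                new String[] {"K2", "V2"},
                new String[] {"K1", "V5"},
                new String[] {"K1", "V3"},
                new String[] {"K1", "V4"});

        Map<String, List<String>> output =
                groupAndSortValues(reduceInput, Comparator.naturalOrder());

        // Prints:
        // K1, [V3, V4, V5]
        // K2, [V2, V3]
        output.forEach((key, values) -> System.out.println(key + ", " + values));
    }
}
```

      Passing a different Comparator<String> changes the value order, which is the role theClass would play in the proposed setter.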

            People

              Assignee: Unassigned
              Reporter: Feng Jiang (fjiang)