Hadoop Common / HADOOP-1535

Wrong comparator used to merge files in Reduce phase


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.3, 0.13.0
    • Fix Version/s: 0.14.0
    • None
    • None

    Description

      Following the fix for HADOOP-485, users can optionally provide a different comparator to group values when the framework calls the user's Reduce function. Devaraj and I were looking at the code yesterday and found that ReduceTask.java also uses this user-supplied comparator to merge the output files from the Map tasks: it is the comparator passed in when the new SequenceFile.Sorter object is created. This is incorrect, because the comparator used to merge Map output files must be the same one that was used to create those files in the Map phase. The user-supplied grouping comparator should be used only in the iterator passed to the user's Reduce function, which the code already does correctly.
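The difference between the two comparators can be illustrated with a small standalone sketch. It does not use Hadoop or reproduce the ReduceTask.java code. The class name, the method names and the "group#sequence" key format are all hypothetical, chosen only for illustration. It assumes that the Map side sorted its output with a full ordering on the key and that the grouping comparator looks only at the part before '#'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names come from Hadoop.
class MergeComparatorSketch {
    // Sort comparator: full ordering on the key, assumed to be what the Map phase used.
    static final Comparator<String> SORT = Comparator.naturalOrder();

    // Grouping comparator: coarser, compares only the part of the key before '#'.
    static final Comparator<String> GROUP =
        Comparator.comparing((String k) -> k.substring(0, k.indexOf('#')));

    // Two-way merge of runs that are each already sorted by the SORT comparator.
    static List<String> merge(List<String> a, List<String> b, Comparator<String> cmp) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            out.add(cmp.compare(a.get(i), b.get(j)) <= 0 ? a.get(i++) : b.get(j++));
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    // Splits a merged stream into runs of keys that the comparator treats as equal,
    // which is the role of the iterator handed to the Reduce function.
    static List<List<String>> group(List<String> merged, Comparator<String> cmp) {
        List<List<String>> groups = new ArrayList<>();
        for (String key : merged) {
            boolean startNew = groups.isEmpty()
                || cmp.compare(groups.get(groups.size() - 1).get(0), key) != 0;
            if (startNew) groups.add(new ArrayList<>());
            groups.get(groups.size() - 1).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> run1 = Arrays.asList("a#1", "a#3", "b#2");
        List<String> run2 = Arrays.asList("a#2", "b#1");

        // Merging with the grouping comparator loses the original sort order.
        System.out.println(merge(run1, run2, GROUP)); // [a#1, a#3, a#2, b#2, b#1]

        // Merging with the sort comparator preserves it.
        List<String> merged = merge(run1, run2, SORT);
        System.out.println(merged);                   // [a#1, a#2, a#3, b#1, b#2]

        // The grouping comparator is applied only afterwards, when forming groups.
        System.out.println(group(merged, GROUP));     // [[a#1, a#2, a#3], [b#1, b#2]]
    }
}
```

In this sketch the grouping comparator treats "a#1", "a#2" and "a#3" as equal, so a merge driven by it cannot restore the order in which the runs were written. Using the sort comparator for the merge and the grouping comparator only when forming groups keeps both behaviours intact.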

      Attachments

        1. 1535_01.patch
          2 kB
          Vivek Ratan
        2. 1535_02.patch
          24 kB
          Vivek Ratan


          People

            Assignee: Vivek Ratan (vivekr)
            Reporter: Vivek Ratan (vivekr)
            Votes: 0
            Watchers: 0
