  1. Hadoop Map/Reduce
  2. MAPREDUCE-6009

Map-only job with new-api runs wrong OutputCommitter when cleanup scheduled in a reduce slot

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.2.1
    • Fix Version/s: 1.3.0, 1.2.2
    • Component/s: client, job submission
    • Labels: None

    Description

      In branch-1, job commit is executed in a JOB_CLEANUP task that may run in either a map slot or a reduce slot.

      In org.apache.hadoop.mapreduce.Job#setUseNewAPI, the reducer's new-API flag is set only for jobs that have reducers:

          if (numReduces != 0) {
            conf.setBooleanIfUnset("mapred.reducer.new-api",
                                   conf.get(oldReduceClass) == null);
            ...
      

      Therefore, for a map-only job the flag is never set. When cleanup runs in a reduce slot, ReduceTask initializes using the old API and runs the incorrect default OutputCommitter instead of consulting the OutputFormat.
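
The failure mode described above can be reproduced in a small stand-alone model. This is only a sketch, not Hadoop code: a plain `Map` stands in for the job configuration, the class and method names (`NewApiFlagSketch`, `setUseNewApi`, `committerInReduceSlot`) are hypothetical, the key `mapred.reducer.class` is an assumed value for `oldReduceClass`, and an unset flag is assumed to read as `false`.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the flag logic in the description; not actual Hadoop code. */
class NewApiFlagSketch {
    // Key quoted in the issue description.
    static final String REDUCER_NEW_API = "mapred.reducer.new-api";
    // Assumption: the snippet only names this variable oldReduceClass; the key value is a guess.
    static final String OLD_REDUCE_CLASS = "mapred.reducer.class";

    /** Stand-in for Configuration#setBooleanIfUnset on a plain map. */
    static void setBooleanIfUnset(Map<String, String> conf, String key, boolean value) {
        conf.putIfAbsent(key, Boolean.toString(value));
    }

    /** Mirrors the quoted branch: the reducer flag is only touched when numReduces != 0. */
    static void setUseNewApi(Map<String, String> conf, int numReduces) {
        if (numReduces != 0) {
            setBooleanIfUnset(conf, REDUCER_NEW_API, conf.get(OLD_REDUCE_CLASS) == null);
        }
    }

    /** Models a cleanup task in a reduce slot picking its committer from the reducer flag. */
    static String committerInReduceSlot(Map<String, String> conf) {
        // Assumption: an unset flag is treated as false (old API).
        boolean useNewApi = Boolean.parseBoolean(conf.getOrDefault(REDUCER_NEW_API, "false"));
        return useNewApi ? "committer from OutputFormat" : "default old-API committer";
    }

    public static void main(String[] args) {
        Map<String, String> mapOnly = new HashMap<>();
        setUseNewApi(mapOnly, 0); // map-only job: flag stays unset
        System.out.println("map-only: " + committerInReduceSlot(mapOnly));

        Map<String, String> withReducers = new HashMap<>();
        setUseNewApi(withReducers, 2); // job with reducers and no old reducer class
        System.out.println("with reducers: " + committerInReduceSlot(withReducers));
    }
}
```

In this model the map-only job ends up with the old-API committer whenever cleanup lands in a reduce slot, while a job with reducers and no old reducer class consults the OutputFormat.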

      Attachments

        1. MAPREDUCE-6009.v01-branch-1.2.patch
          2 kB
          Gera Shegalov
        2. MAPREDUCE-6009.v02-branch-1.2.patch
          0.9 kB
          Gera Shegalov

          People

            Assignee: Gera Shegalov
            Reporter: Gera Shegalov
            Votes: 0
            Watchers: 6
