  1. Hadoop Common
  2. HADOOP-542

on-the-fly merge sort, HADOOP-540, reformat

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      Tested on Linux and Windows

      Description

      A large patch for streaming. Changes:

      Support for on-the-fly merge sort of multiple map input files.
      This assumes that each input is already sorted.
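The merge described above can be illustrated with a standard k-way merge over already-sorted line streams. This is a minimal sketch, not the code from the patch: the class name `SortedLineMerger`, the `merge` method, and the choice of plain string ordering are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper: streams a k-way merge of pre-sorted line inputs. */
public class SortedLineMerger {

  /** The current front line of one input, plus the reader it came from. */
  private static final class Head implements Comparable<Head> {
    final String line;
    final BufferedReader source;

    Head(String line, BufferedReader source) {
      this.line = line;
      this.source = source;
    }

    @Override
    public int compareTo(Head other) {
      // Assumption: inputs are sorted by natural String order.
      return line.compareTo(other.line);
    }
  }

  /** Writes all lines from the sorted inputs to out, in sorted order. */
  public static void merge(List<BufferedReader> inputs, Writer out)
      throws IOException {
    PriorityQueue<Head> heap = new PriorityQueue<>();
    // Seed the heap with the first line of every non-empty input.
    for (BufferedReader in : inputs) {
      String first = in.readLine();
      if (first != null) {
        heap.add(new Head(first, in));
      }
    }
    // Repeatedly emit the smallest front line and refill from its source,
    // so only one line per input is held in memory at a time.
    while (!heap.isEmpty()) {
      Head smallest = heap.poll();
      out.write(smallest.line);
      out.write('\n');
      String next = smallest.source.readLine();
      if (next != null) {
        heap.add(new Head(next, smallest.source));
      }
    }
    out.flush();
  }

  public static void main(String[] args) throws IOException {
    List<BufferedReader> inputs = new ArrayList<>();
    inputs.add(new BufferedReader(new StringReader("apple\ncherry\n")));
    inputs.add(new BufferedReader(new StringReader("banana\ndate\n")));
    StringWriter merged = new StringWriter();
    merge(inputs, merged);
    System.out.print(merged);
  }
}
```

Because the heap holds only one pending line per input, the output can be produced while the inputs are still being read, which is what makes the merge "on-the-fly". If any input is not sorted, the output will not be sorted either.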

      Support for reducer-NONE side effects to a single local output with DFS inputs.
      This can be used to do an on-the-fly merge sort of remote sorted files.
      (Compare to DFSShell -getmerge, which concatenates remote sorted files.)
      The single output can be a regular file, a named pipe, or a socket.
      URI syntax: -mapsideoutput file:/C:/win

      Add an optional JUnit test for on-the-fly merge sort.
      It requires Unix tools; it also works with Cygwin.

      When consuming a stderr line from the application, call reporter.setStatus()
      if more than 10 seconds have passed since the last such call.
      Calling setStatus with reducer-NONE was already done as part of HADOOP-413,
      so together these changes resolve HADOOP-540.
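The time-based throttling described above might look roughly like the following. This is a sketch under stated assumptions, not the patch's code: `StatusThrottle` and `shouldReport` are hypothetical names, the `StatusSink` interface is a stand-in for Hadoop's reporter so the example compiles on its own, and reporting on the very first line is an assumption the ticket does not spell out.

```java
/** Hypothetical helper: allows a status update at most once per interval. */
public class StatusThrottle {

  /** Stand-in for the real reporter; only the one call used here. */
  public interface StatusSink {
    void setStatus(String status);
  }

  private final long intervalMillis;
  private long lastReportMillis;
  private boolean reportedOnce = false;

  public StatusThrottle(long intervalMillis) {
    this.intervalMillis = intervalMillis;
  }

  /**
   * Returns true if a status update is due at time nowMillis, and records
   * that time. Assumption: the first call always reports.
   */
  public boolean shouldReport(long nowMillis) {
    if (!reportedOnce || nowMillis - lastReportMillis > intervalMillis) {
      reportedOnce = true;
      lastReportMillis = nowMillis;
      return true;
    }
    return false;
  }

  /** Called for each stderr line read from the streamed application. */
  public void onStderrLine(String line, StatusSink sink) {
    if (shouldReport(System.currentTimeMillis())) {
      sink.setStatus(line);
    }
  }

  public static void main(String[] args) {
    StatusThrottle throttle = new StatusThrottle(10_000L); // 10 seconds
    StatusSink sink = status -> System.out.println("status: " + status);
    throttle.onStderrLine("first line", sink);   // reported
    throttle.onStderrLine("second line", sink);  // skipped: under 10 s later
  }
}
```

Passing the clock value into `shouldReport` keeps the timing logic testable without sleeping; the real code may simply read the system clock inline.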

      Reformat the streaming code to conform to Hadoop conventions
      (2-space indent, opening brace on the same line).

        Attachments

        1. bigmux.patch2
          195 kB
          Michel Tourn

            People

            • Assignee:
              Unassigned
            • Reporter:
              Michel Tourn (michel_tourn)
