Hadoop Map/Reduce
MAPREDUCE-1785

Add streaming config option for not emitting the key

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: contrib/streaming
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added a configuration property "stream.map.input.ignoreKey" that specifies whether the key is omitted when writing input for the mapper. The property takes effect only if stream.map.input.writer.class is org.apache.hadoop.streaming.io.TextInputWriter.class. For all other InputWriters, the key is always written.
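
      The effect of the property can be pictured with a small Python model. This is a sketch of the behaviour the release note describes, not Hadoop's TextInputWriter code: the function names `write_record` and `feed_mapper` are hypothetical, and the tab separator between key and value is an assumption.

```python
import io


def write_record(out, key, value, ignore_key, separator="\t"):
    """Write one record to the mapper's input stream (hypothetical model).

    When ignore_key is true only the value is written; otherwise the key,
    an assumed tab separator, and the value are written.
    """
    if not ignore_key:
        out.write(f"{key}{separator}")
    out.write(f"{value}\n")


def feed_mapper(records, ignore_key):
    """Return the text a mapper would read on stdin for the given records."""
    out = io.StringIO()
    for key, value in records:
        write_record(out, key, value, ignore_key)
    return out.getvalue()


# Keys here stand in for the byte offsets a text input format would supply.
records = [(0, "hello world"), (12, "goodbye world")]

if __name__ == "__main__":
    print(repr(feed_mapper(records, ignore_key=True)))
    print(repr(feed_mapper(records, ignore_key=False)))
```

      With `ignore_key=True` the mapper sees only the lines of text; with `ignore_key=False` each line is prefixed by its key.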

      Description

      PipeMapper currently does not emit the key when using TextInputFormat. If you switch to another input format (e.g. LzoTextInputFormat), the key is emitted. We should add an option that lets users explicitly tell streaming not to emit the key, so they can change input formats without breaking or having to modify their existing programs.
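
      To illustrate why an unexpected key can break an existing program, here is a minimal sketch of a word-counting streaming mapper written in Python. The function `count_words` and the sample lines are hypothetical; the sketch assumes the key is a byte offset separated from the value by a tab.

```python
from collections import Counter


def count_words(lines):
    """Count whitespace-separated tokens, as a naive streaming mapper might."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


# Input as the mapper sees it when the key is not emitted.
without_key = ["hello world"]
# The same record when the key (assumed to be a byte offset) is emitted.
with_key = ["0\thello world"]

if __name__ == "__main__":
    print(count_words(without_key))
    print(count_words(with_key))
```

      In the second case the offset "0" is counted as if it were a word, so the program's output changes even though the input data did not.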

        Activity

        Changes are listed as Field: Original Value -> New Value.

        Eli Collins created issue
        Eli Collins made changes
          Attachment: (none) -> mapreduce-1785-1.patch [ 12444445 ]
        Eli Collins made changes
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Sharad Agarwal made changes
          Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
          Hadoop Flags: (none) -> [Reviewed]
          Resolution: (none) -> Fixed [ 1 ]
        Amareshwari Sriramadasu made changes
          Release Note: (none) -> Added a configuration property "stream.map.input.ignoreKey" to specify whether to ignore key or not while reading input.
        Amareshwari Sriramadasu made changes
          Release Note: Added a configuration property "stream.map.input.ignoreKey" to specify whether to ignore key or not while reading input. -> Added a configuration property "stream.map.input.ignoreKey" to specify whether to ignore key or not while writing input for the mapper. This configuration parameter is valid only if stream.map.input.writer.class is org.apache.hadoop.streaming.io.TextInputWriter.class. For all other InputWriter's, key is always written.

          People

          • Assignee: Eli Collins
          • Reporter: Eli Collins
          • Votes: 0
          • Watchers: 1
