  1. Hadoop Map/Reduce
  2. MAPREDUCE-3991

Streaming FAQ has some wrong instructions about input files splitting

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 2.0.0-alpha
    • Component/s: documentation
    • Labels:
      None

      Description

      Streaming docs say, at: http://hadoop.apache.org/common/docs/current/streaming.html#How+do+I+process+files%2C+one+per+map%3F

      "Generate a file containing the full HDFS path of the input files. Each map task would get one file name as input."

      This is incorrect: MapReduce splits an input file by size, not by line, so a single listing file is not divided into one file name per map task.
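
      The difference can be illustrated with a small simulation. This is a minimal sketch in plain Python, not Hadoop code: `size_based_splits` and `line_based_splits` are hypothetical helpers, the paths are made up, the 128 MiB split size is an assumed value, and the rule "a line belongs to the split that contains its first byte" is a simplification of how a line-oriented record reader treats split boundaries.

```python
from typing import List


def size_based_splits(data: bytes, split_size: int) -> List[List[str]]:
    """Simplified model of size-based splitting of a text file.

    The file is cut into byte ranges of `split_size`; each line is assigned
    to the range that contains its first byte (an assumption of this sketch).
    """
    n_splits = max(1, -(-len(data) // split_size))  # ceiling division
    splits: List[List[str]] = [[] for _ in range(n_splits)]
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        splits[offset // split_size].append(raw_line.decode().rstrip("\n"))
        offset += len(raw_line)
    return splits


def line_based_splits(lines: List[str], lines_per_split: int) -> List[List[str]]:
    """Model of line-based splitting: every N lines form one split."""
    return [lines[i:i + lines_per_split]
            for i in range(0, len(lines), lines_per_split)]


# Hypothetical listing file with one HDFS path per line.
paths = ["/data/a.txt", "/data/b.txt", "/data/c.txt"]
listing = ("\n".join(paths) + "\n").encode()

# Assumed split size of 128 MiB: the tiny listing fits in a single split,
# so a single map task would receive all three file names.
by_size = size_based_splits(listing, 128 * 1024 * 1024)

# Splitting by line instead yields one file name per split.
by_line = line_based_splits(paths, 1)

print(len(by_size), len(by_line))
```

      Under these assumptions the whole listing lands in one size-based split, while line-based splitting gives one split per file name. In Hadoop, line-based splitting of this kind is the behavior of NLineInputFormat; whether the attached patch recommends it is not stated in this issue.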

        Attachments

        1. MAPREDUCE-3991.patch
          1 kB
          Harsh J
        2. MAPREDUCE-3991.patch
          1 kB
          Harsh J

            People

            • Assignee:
              Harsh J (qwertymaniac)
              Reporter:
              Harsh J (qwertymaniac)
            • Votes:
              0
              Watchers:
              2
