Hadoop Map/Reduce / MAPREDUCE-3991

Streaming FAQ has some wrong instructions about input files splitting

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 2.0.0-alpha
    • Component/s: documentation
    • Labels: None

    Description

      The Streaming docs say, at: http://hadoop.apache.org/common/docs/current/streaming.html#How+do+I+process+files%2C+one+per+map%3F

      "Generate a file containing the full HDFS path of the input files. Each map task would get one file name as input."

      This is incorrect: MapReduce splits an input file by size, not by lines, so a map task may receive several file names rather than exactly one.
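
      The point above can be illustrated with a minimal Python sketch. It is a simplified model, not Hadoop code: the function name `lines_per_split`, the example paths, and the split sizes are all assumptions made for illustration, and the rule "a line belongs to the split in which its first byte falls" only approximates what a line-based record reader does.

```python
def lines_per_split(lines, split_size):
    """Count how many lines land in each size-based split.

    Simplified model (an assumption, not Hadoop's implementation):
    a line is assigned to the split that contains its first byte.
    """
    counts = {}
    offset = 0
    for line in lines:
        index = offset // split_size
        counts[index] = counts.get(index, 0) + 1
        offset += len(line.encode("utf-8")) + 1  # +1 for the trailing newline
    if not counts:
        return []
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


# Hypothetical listing file: 100 HDFS paths, one per line (32 bytes each with newline).
paths = [f"/user/example/input/file{i:03d}.txt" for i in range(100)]

# With a split size far larger than the listing file, every path falls in one split.
print(lines_per_split(paths, 128 * 1024 * 1024))

# Only when the split size happens to match the line length does each split hold one path.
print(lines_per_split(paths, 32)[:5])
```

      In this model the number of file names a map task sees depends on byte offsets and the split size, not on line count, which is why the FAQ wording cannot be relied on as written.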

      Attachments

        1. MAPREDUCE-3991.patch
          1 kB
          Harsh J
        2. MAPREDUCE-3991.patch
          1 kB
          Harsh J

        Activity

          People

            Assignee: Harsh J (qwertymaniac)
            Reporter: Harsh J (qwertymaniac)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: