Details
Type: Improvement
Status: Closed
Priority: Trivial
Resolution: Fixed
0.23.0
None
Description
The Streaming docs say, at http://hadoop.apache.org/common/docs/current/streaming.html#How+do+I+process+files%2C+one+per+map%3F:
"Generate a file containing the full HDFS path of the input files. Each map task would get one file name as input."
This is incorrect: for MapReduce, an input file is split by size, not by line, so each map task would not necessarily get a single file name as input.
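
To illustrate the point, here is a minimal Python sketch of size-based splitting. It is a simplified model, not Hadoop's actual split computation: the helper `count_size_based_splits`, the example paths, and the 64 MB split size are all assumptions made up for illustration.

```python
import math


def count_size_based_splits(file_size_bytes, split_size_bytes):
    """Return how many splits a file yields when cut purely by byte size.

    Simplified model for illustration only; it ignores any framework-specific
    rounding or minimum/maximum split settings.
    """
    if file_size_bytes == 0:
        return 0
    return math.ceil(file_size_bytes / split_size_bytes)


# Hypothetical list file: 100 HDFS paths, one per line.
paths = [f"/user/example/input/part-{i:05d}\n" for i in range(100)]
list_file_size = sum(len(p.encode("utf-8")) for p in paths)

# Assumed split size of 64 MB (illustrative value, not read from any config).
split_size = 64 * 1024 * 1024

num_splits = count_size_based_splits(list_file_size, split_size)
print(len(paths), "lines ->", num_splits, "split(s)")
```

Under this model the list file is only a few kilobytes, far below the split size, so the line count has no influence on the number of splits: all of the file names land in the same split rather than one per map task.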