Details
Bug
Status: Resolved
Minor
Resolution: Fixed
Reviewed
Improved streaming job failure handling when #link is missing from the URI passed to -cacheArchive. Previously the job failed only when individual tasks were launched; now it fails at job submission.
Description
Ran the Hadoop streaming command as follows:
bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input in -output out -mapper "xargs cat" -reducer "bin/cat" -cacheArchive hdfs://h/pathofJarFile
Streaming submits the job to the JobTracker, and the map task fails.
For a similar command with -cacheFile:
bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input in -output out -mapper "xargs cat" -reducer "bin/cat" -cacheFile hdfs://h/pathofFile
the following error is reported back:
[
You need to specify the uris as hdfs://host:port/#linkname,Please specify a different link name for all of your caching URIs
]
Streaming should check that #link is present after the URI given to -cacheArchive, as it already does for -cacheFile, and throw a proper error at submission time if it is missing.
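
The requested check could look roughly like the following. This is only a minimal sketch using java.net.URI, not the actual patch: the class name CacheUriCheck and the methods hasLinkName and validate are hypothetical, and the error text simply reuses the message quoted above.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Hadoop: rejects cache URIs without a #linkname.
class CacheUriCheck {

    // Returns true when the URI parses and carries a non-empty fragment (the link name).
    static boolean hasLinkName(String uriString) {
        try {
            String fragment = new URI(uriString).getFragment();
            return fragment != null && !fragment.isEmpty();
        } catch (URISyntaxException e) {
            // An unparsable URI cannot carry a usable link name.
            return false;
        }
    }

    // Fails fast, before job submission, if the link name is missing.
    static void validate(String option, String uriString) {
        if (!hasLinkName(uriString)) {
            throw new IllegalArgumentException(
                option + " " + uriString + ": "
                + "You need to specify the uris as hdfs://host:port/#linkname,"
                + "Please specify a different link name for all of your caching URIs");
        }
    }

    public static void main(String[] args) {
        // Accepted: the URI ends with a link name.
        validate("-cacheArchive", "hdfs://h/pathofJarFile#jar");

        // Rejected: no #linkname, so the error surfaces at submission time.
        try {
            validate("-cacheArchive", "hdfs://h/pathofJarFile");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running the same validation for both -cacheFile and -cacheArchive before the job is handed to the JobTracker would make the two options fail in the same way and at the same point.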