Affects Version/s: 0.20.205.0, 0.23.0
When running a streaming job with mapper
bin/hadoop --config /etc/hadoop/ jar contrib/streaming/hadoop-streaming-0.20.205.0.jar -mapper "hadoop --config /etc/hadoop/ dfs -help" -reducer NONE -input "/tmp/input.txt" -output NONE
Task log shows:
Upon inspection, there are two problems in the inherited from environment which prevent the logger initialization to work properly. In hadoop-env.sh, the HADOOP_OPTS is inherited from the parent process. This configuration was requested by user to have a way to override HADOOP environment in the configuration template:
-Dhadoop.log.dir=$HADOOP_LOG_DIR/task_tracker_user is injected into HADOOP_OPTS in the tasktracker environment. Hence, the running task would inherit the wrong logging directory, which the end user might not have sufficient access to write. Second, $HADOOP_ROOT_LOGGER is override to: -Dhadoop.root.logger=INFO,TLA by the task controller, therefore, the bin/hadoop script will attempt to use hadoop.root.logger=INFO,TLA, but fail to initialize.
|Status||Resolved [ 5 ]||Closed [ 6 ]|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Resolution||Fixed [ 1 ]|
|Status||Open [ 1 ]||Patch Available [ 10002 ]|
Removed inheritance of certain server environment variables (HADOOP_OPTS and HADOOP_ROOT_LOGGER) in task attempt process.
|Affects Version/s||0.23.0 [ 12315570 ]|
|Fix Version/s||0.23.0 [ 12315570 ]|