The basic problem is the export of HADOOP_ENV_PROCESSED, which is used to prevent hadoop-env.sh from being re-read. This protection makes sense if you have, e.g., FOO_OPTS="$FOO_OPTS blah" in this file, since every re-read would append to the value again. But with the current state of the world, there are quite a few things going wrong here:
1. We don't protect the other *-env.sh files like we do hadoop-env.sh. For example, yarn-env.sh gets re-read multiple times.
2. This only works correctly because the vars in hadoop-env.sh are also exported, which is probably the wrong thing to do as well.
3. Now that there are more than just vars in *-env.sh, the system really does need to re-read *-env.sh to pick those up.
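
The three points above can be reproduced in isolation. The following is a minimal sketch, not the actual Hadoop scripts: the temporary file stands in for an *-env.sh file, and DEMO_ENV_PROCESSED, load_env, and foo_hook are hypothetical names used only for illustration. It assumes bash is available on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for an *-env.sh file (not the real hadoop-env.sh).
envfile="$(mktemp)"
cat > "$envfile" <<'EOF'
export FOO_OPTS="$FOO_OPTS blah"
foo_hook() { echo "hook ran"; }
EOF

# Without a guard, every re-read appends to the variable again.
unset FOO_OPTS
. "$envfile"
. "$envfile"
unguarded="$FOO_OPTS"

# With an exported guard (hypothetical name), the file is read only once.
unset FOO_OPTS DEMO_ENV_PROCESSED
load_env() {
  if [ -z "$DEMO_ENV_PROCESSED" ]; then
    . "$envfile"
    export DEMO_ENV_PROCESSED=true
  fi
}
load_env
load_env
guarded="$FOO_OPTS"

# A child process inherits the exported variable and the guard,
# but not the function, since functions are not exported by default.
child_var="$(bash -c 'echo "$FOO_OPTS"')"
child_func="$(bash -c 'type foo_hook >/dev/null 2>&1 && echo yes || echo no')"

rm -f "$envfile"
```

In this sketch the guard only looks safe because FOO_OPTS is exported along with it. A child that honors the guard never sees foo_hook, which is why anything beyond plain variables forces a re-read.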
Some potential solutions worth exploring:
- What happens if we no longer export anything in *-env.sh and re-read them at every command run like we do with the other files? (This brings Hadoop in line with most other OS utilities.) One side effect: users upgrading to trunk would also need to remove the export lines as part of upgrading.
- What if we protect all of the *-env.sh files and move user-defined functions to a new file? This has a lot of potential! One big problem is what to do about YARN_OPTS.
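
As a rough illustration of the first option only, here is a sketch of an env file with no export lines that every process re-reads for itself. The temporary file and foo_hook are hypothetical stand-ins rather than real Hadoop files or functions, and the sketch assumes bash is on the PATH. It does not address the YARN_OPTS question raised in the second option.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for an *-env.sh file with the export lines removed.
envfile="$(mktemp)"
cat > "$envfile" <<'EOF'
FOO_OPTS="$FOO_OPTS blah"
foo_hook() { echo "hook ran"; }
EOF

# The parent reads the file once per command run; nothing is exported.
unset FOO_OPTS
. "$envfile"
parent_opts="$FOO_OPTS"

# A child re-reads the file itself. Because FOO_OPTS was not inherited,
# the append starts from an empty value instead of accumulating.
child_opts="$(bash -c '. "$1"; echo "$FOO_OPTS"' _ "$envfile")"

# Re-reading also makes the function available in the child.
child_hook="$(bash -c '. "$1"; foo_hook' _ "$envfile")"

rm -f "$envfile"
```

Under these assumptions no guard variable is needed: each process builds its settings from scratch, so appends do not stack up across process boundaries and functions are always defined.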