Details
- Bug
- Status: Closed
- Minor
- Resolution: Fixed
- 2.0.3-alpha
- None
- Reviewed
Description
At least in the case of YarnChild, the application classloader is set fairly early, both in the Configuration and as the thread context classloader (TCCL). As a result, the application classloader starts being used earlier than expected.
A fair amount of code is invoked between setting the classloader and executing the mapper/reducer task, and all of that code may resolve classes through the application classloader.
For example, I saw that the application classloader was asked to load a DOM parser class (com.sun.org.apache.xerces...) as part of initializing the filesystem. Luckily, in most cases this would be delegated to the parent classloader as the job classpath would not have those classes.
However, in general, this behavior carries the risk of loading classes with the app classloader accidentally, and potentially causing problems such as ClassCastException. Those would turn into nasty bugs that are hard to fix.
It would be good to either set the application classloader as late as possible, or restrict it more explicitly so that it loads only the mapper/reducer classes and their dependencies.
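To illustrate the "as late as possible" option, here is a minimal sketch of scoping the TCCL to the task invocation only. It is not the change that was made in Hadoop; the class name ScopedContextClassLoader and its call method are hypothetical, and the empty URLClassLoader merely stands in for the real job classloader. Setting the classloader in the Configuration is not shown.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only: it is not part of Hadoop.
public final class ScopedContextClassLoader {
    private ScopedContextClassLoader() {}

    /**
     * Installs appLoader as the thread context classloader only for the
     * duration of task, then restores whatever loader was set before.
     */
    public static <T> T call(ClassLoader appLoader, Callable<T> task) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(appLoader);
        try {
            return task.call();
        } finally {
            // Restore even if the task throws, so later framework code
            // does not resolve classes through the application loader.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader frameworkLoader = Thread.currentThread().getContextClassLoader();
        // Stand-in for the job classloader (assumption: no job jars here).
        ClassLoader appLoader = new URLClassLoader(new URL[0], frameworkLoader);

        // Framework setup (filesystem init, XML parsing, ...) would run here,
        // still under frameworkLoader.

        boolean sawAppLoader = call(appLoader,
                () -> Thread.currentThread().getContextClassLoader() == appLoader);
        boolean restored =
                Thread.currentThread().getContextClassLoader() == frameworkLoader;
        System.out.println("task saw app loader: " + sawAppLoader);
        System.out.println("loader restored afterwards: " + restored);
    }
}
```

With this shape, anything that runs before the task body, such as filesystem initialization, never sees the application classloader, which narrows the window for accidental loads.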