Description
Execution of TezChild often ends with a RejectedExecutionException (see the stack trace below):
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.hortonworks.tez.minicluster.InJvmContainerExecutor.doLaunch(InJvmContainerExecutor.java:178)
    at com.hortonworks.tez.minicluster.InJvmContainerExecutor.launchContainer(InJvmContainerExecutor.java:82)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.RejectedExecutionException: Task com.google.common.util.concurrent.ListenableFutureTask@20c5f562 rejected from java.util.concurrent.ThreadPoolExecutor@247105bd[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 3]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
    at com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:440)
    at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:56)
    at org.apache.tez.runtime.task.TezChild.run(TezChild.java:177)
    at org.apache.tez.runtime.task.TezChild.main(TezChild.java:359)
    ... 12 more
This results in exit code 143 from the child process, which seems to be OK, and the overall Tez job completes successfully. That said, I think managing the lifecycle of the process through a random exception is not appropriate, especially when the problem is easy to avoid.
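The trace shows a submit() call reaching a ThreadPoolExecutor that is already in the Terminated state. As a rough illustration of how such a rejection could be avoided, here is a minimal standalone sketch. It is not the actual TezChild code or the eventual patch: the class GuardedSubmit and the method submitIfRunning are hypothetical names, and it assumes that skipping the task is acceptable once the executor has been shut down.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical helper for illustration only; not part of Tez.
public class GuardedSubmit {

    /**
     * Submits the task unless the executor has already been shut down.
     * Returns null when the task was not accepted.
     */
    public static <T> Future<T> submitIfRunning(ExecutorService executor, Callable<T> task) {
        if (executor.isShutdown()) {
            // Shutdown was requested earlier: skip the task instead of
            // letting submit() throw.
            return null;
        }
        try {
            return executor.submit(task);
        } catch (RejectedExecutionException e) {
            // Shutdown raced with the check above; treat it the same way.
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        Future<String> accepted = submitIfRunning(executor, () -> "ran");
        System.out.println("before shutdown: " + accepted.get());

        executor.shutdownNow();

        Future<String> skipped = submitIfRunning(executor, () -> "never runs");
        System.out.println("after shutdown: " + (skipped == null ? "skipped" : "submitted"));
    }
}
```

The isShutdown() check alone is not enough, because another thread can shut the executor down between the check and the submit() call, so the sketch still catches RejectedExecutionException and maps both cases to the same quiet outcome.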
I'll provide a patch shortly.