Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.5.0-incubating
-
None
-
None
Description
Apache Hadoop YARN has a configuration property that lets the ResourceManager restart an ApplicationMaster (AM) after a failure, up to a fixed number of attempts. The limit is set by yarn.resourcemanager.am.max-attempts (default is 2).
This setting gives the AM HA-like behavior: it can survive failures up to the configured number of attempts.
The Twill AppMaster seems to have a problem restarting after it fails unexpectedly (e.g., from a kill signal).
<code>
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>2</value>
</property>
</code>
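To make the expected behavior concrete, here is a minimal sketch of the attempt-limit semantics described above. It is not YARN or Twill code: the class AmAttemptSketch and the method runWithMaxAttempts are hypothetical names, and a simple predicate stands in for whether a given AM attempt survives.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration only; this does not call any YARN or Twill API.
public class AmAttemptSketch {

    /**
     * Tries attempts 1..maxAttempts in order and stops at the first one that succeeds.
     * Returns the number of the successful attempt, or -1 if every attempt failed.
     */
    public static int runWithMaxAttempts(int maxAttempts, IntPredicate attemptSucceeds) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (attemptSucceeds.test(attempt)) {
                return attempt;
            }
        }
        return -1; // attempt limit exhausted, the application is considered failed
    }

    public static void main(String[] args) {
        int maxAttempts = 2; // mirrors the default of yarn.resourcemanager.am.max-attempts

        // First attempt is killed, second attempt survives.
        System.out.println(runWithMaxAttempts(maxAttempts, attempt -> attempt >= 2)); // prints 2

        // Every attempt fails, so the limit is exhausted.
        System.out.println(runWithMaxAttempts(maxAttempts, attempt -> false)); // prints -1
    }
}
```

With a limit of 2, an AM that is killed once should come back on the second attempt. The problem reported here is that the Twill AppMaster does not appear to do so.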