Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Won't Fix
Description
Today, if a job tracker becomes unresponsive or dies, the hadoop JobClient throws an exception that is a subclass of IOException and exits with an exit code of 1. However, it would probably exit with the same code for any other type of exception as well. Programs like HOD, which use this client (indirectly, through the hadoop script), could make better decisions if the exit code were more distinguishable. For example, if it is a network-related exception, we can treat the cluster as unusable, or retry after a while. More generally, if categories of exceptions were mapped to specific exit codes, it would help.
Comments?
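
To make the proposal concrete, here is a rough sketch of how exception categories could be mapped to exit codes. It is only an illustration: `ExitCodeMapper`, its constants, and the specific code values are hypothetical and not part of Hadoop, and the choice of which `java.net` exceptions count as "network related" is an assumption.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop: maps exception categories to exit codes.
final class ExitCodeMapper {
    // Illustrative values only; the issue does not define any specific codes.
    static final int EXIT_GENERIC = 1;   // the existing catch-all exit code
    static final int EXIT_NETWORK = 2;   // assumed: job tracker unreachable or unresponsive
    static final int EXIT_OTHER_IO = 3;  // assumed: any other IOException

    // Guard against pathological (cyclic) cause chains.
    private static final int MAX_CAUSE_DEPTH = 32;

    private ExitCodeMapper() {}

    static int exitCodeFor(Throwable error) {
        boolean sawIoException = false;
        int depth = 0;
        // Walk the cause chain, since network failures are often wrapped
        // inside a more general exception.
        for (Throwable t = error; t != null && depth < MAX_CAUSE_DEPTH; t = t.getCause(), depth++) {
            if (t instanceof ConnectException
                    || t instanceof NoRouteToHostException
                    || t instanceof SocketTimeoutException
                    || t instanceof UnknownHostException) {
                return EXIT_NETWORK;
            }
            if (t instanceof IOException) {
                sawIoException = true;
            }
        }
        return sawIoException ? EXIT_OTHER_IO : EXIT_GENERIC;
    }
}
```

A client's top-level catch block could then call `System.exit(ExitCodeMapper.exitCodeFor(e))`, and a caller such as HOD could retry on code 2 while treating other non-zero codes as hard failures.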