Thanks for the review.
How about JobStateInternal, TaskStateInternal and TaskAttemptInternal?
Yep. The current names are terrible. I was trying to keep them short, considering their extensive use in the state machine definitions.
Also, Job,Task and TaskAttempt (internal) interfaces don't need to expose the (external) JobState/TaskState etc.
Unfortunately, the history server is tied to these interfaces. Not exposing the external state would mean the history server may not be usable across minor AM changes.
Eventually, we could change the history server to rely on hadoop-mapreduce-client-api instead of a specific app, and ensure that no other module depends on the Job, Task and TaskAttempt interfaces. For now though, I think we need to keep the external getState(). I will try getting rid of the internal getState() though, since that should not be used anywhere outside of the hadoop-mapreduce-client-app module.
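To make the split concrete, here is a rough sketch of what I have in mind. Everything in it is illustrative and not the actual patch: the enum constants, the JobImpl accessor names and the internal-to-external mapping are assumptions. The Job interface keeps only the external getState(), while the internal state stays on the implementation with package-private access.

```java
// Illustrative sketch only; names and state lists are assumptions, not the real source.
enum JobState { NEW, RUNNING, SUCCEEDED, FAILED, KILLED }

enum JobStateInternal { NEW, INITED, RUNNING, KILL_WAIT, SUCCEEDED, FAILED, KILLED }

// Interface seen by other modules (e.g. the history server): external state only.
interface Job {
    JobState getState();
}

final class JobImpl implements Job {
    private JobStateInternal internalState = JobStateInternal.NEW;

    // Package-private: only code inside the app module can see the internal state.
    JobStateInternal getInternalState() {
        return internalState;
    }

    // Hypothetical setter standing in for a state machine transition.
    void setInternalState(JobStateInternal newState) {
        internalState = newState;
    }

    // The external state is derived from the internal one (assumed mapping).
    @Override
    public JobState getState() {
        switch (internalState) {
            case NEW:
            case INITED:
                return JobState.NEW;
            case RUNNING:
            case KILL_WAIT:
                return JobState.RUNNING;
            case SUCCEEDED:
                return JobState.SUCCEEDED;
            case FAILED:
                return JobState.FAILED;
            case KILLED:
                return JobState.KILLED;
            default:
                throw new IllegalStateException("Unmapped state: " + internalState);
        }
    }
}
```

With this shape, anything that depends only on Job never sees JobStateInternal, so internal states can be added or renamed without touching those callers.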
Interesting question is about the 0.23 releases and their compatibility
I don't see the point of retaining proto field numbers if the states will change. The options I see are:
1) Don't change the hadoop-mapreduce-client-api states. This would allow a 0.23 client to talk to a 2.0 AM.
2) Define states in hadoop-mapreduce-client-api which are useful to the client. 0.23 clients may fail when talking to 2.0 AMs. I'd prefer this option, since I don't think the APIs are finalized yet, MRv1 APIs are being touched, and there are several states like KILL_CONTAINER_CLEANUP and KILL_TASK_CLEANUP which don't make a lot of sense for clients.
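As a rough illustration of option 2, the client-facing enum could be a reduced set, with the AM collapsing its cleanup states before anything goes over the wire. The sketch below is hypothetical: the enum constants other than KILL_CONTAINER_CLEANUP and KILL_TASK_CLEANUP, the ClientStateMapper class, and the choice to report cleanup states as KILLED are all assumptions made for the example.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical client-visible states (what hadoop-mapreduce-client-api would expose).
enum TaskAttemptState { RUNNING, SUCCEEDED, FAILED, KILLED }

// Hypothetical AM-internal states, including cleanup states clients don't care about.
enum TaskAttemptStateInternal {
    RUNNING, KILL_CONTAINER_CLEANUP, KILL_TASK_CLEANUP, SUCCEEDED, FAILED, KILLED
}

final class ClientStateMapper {
    private static final Map<TaskAttemptStateInternal, TaskAttemptState> TABLE =
        new EnumMap<>(TaskAttemptStateInternal.class);

    static {
        TABLE.put(TaskAttemptStateInternal.RUNNING, TaskAttemptState.RUNNING);
        // Assumption: a client only needs to know the attempt is being killed.
        TABLE.put(TaskAttemptStateInternal.KILL_CONTAINER_CLEANUP, TaskAttemptState.KILLED);
        TABLE.put(TaskAttemptStateInternal.KILL_TASK_CLEANUP, TaskAttemptState.KILLED);
        TABLE.put(TaskAttemptStateInternal.SUCCEEDED, TaskAttemptState.SUCCEEDED);
        TABLE.put(TaskAttemptStateInternal.FAILED, TaskAttemptState.FAILED);
        TABLE.put(TaskAttemptStateInternal.KILLED, TaskAttemptState.KILLED);

        // Fail fast at class load if a new internal state is added without a mapping.
        for (TaskAttemptStateInternal s : TaskAttemptStateInternal.values()) {
            if (!TABLE.containsKey(s)) {
                throw new IllegalStateException("Unmapped internal state: " + s);
            }
        }
    }

    private ClientStateMapper() { }

    static TaskAttemptState toClientState(TaskAttemptStateInternal internal) {
        return TABLE.get(internal);
    }
}
```

The point of the table is that new internal states only require a new entry here; the client-side enum, and whatever proto values back it, stay untouched.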