Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Currently, once a Samza job has started, it consumes only the fixed set of input topic partitions that existed at start-up. When the partitions of an input topic are expanded, we often lose the messages sent to the new partitions until the job is restarted.
SAMZA-882 added an input stream partition count monitor inside the JobCoordinator. This ticket aims to use that monitor's metrics to trigger the following actions in YARN:
- for stateless jobs, shut down the JobCoordinator with the UNDEFINED status code so that YARN restarts the whole job
- for stateful jobs, shut down the JobCoordinator with the FAILED status code so that YARN stops the whole job
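
The two rules above can be sketched as a small decision function. This is only an illustration of the intended behavior, not the actual Samza patch: the class `PartitionChangePolicy`, the enum `ShutdownStatus`, and both method names are hypothetical, and the enum merely mirrors the two status codes named in this ticket rather than using YARN's own types.

```java
import java.util.Map;

// Hypothetical sketch: none of these names come from the Samza code base.
final class PartitionChangePolicy {

    // Mirrors the two status codes mentioned in the ticket.
    enum ShutdownStatus { UNDEFINED, FAILED }

    // Returns true if any topic known at start-up now has more partitions.
    static boolean partitionsExpanded(Map<String, Integer> initialCounts,
                                      Map<String, Integer> currentCounts) {
        for (Map.Entry<String, Integer> entry : currentCounts.entrySet()) {
            Integer before = initialCounts.get(entry.getKey());
            if (before != null && entry.getValue() > before) {
                return true;
            }
        }
        return false;
    }

    // Stateless jobs exit with UNDEFINED so that YARN restarts the job;
    // stateful jobs exit with FAILED so that YARN stops the job.
    static ShutdownStatus statusFor(boolean stateful) {
        return stateful ? ShutdownStatus.FAILED : ShutdownStatus.UNDEFINED;
    }
}
```

A monitor callback would call `partitionsExpanded` with the counts recorded at start-up and the latest observed counts, then shut down the JobCoordinator with `statusFor(...)` when it returns true.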