Samza / SAMZA-70

Create setup class to handle per-job startup setup

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: container
    • Labels: None

    Description

      There is some Samza job setup that happens before tasks can be run. This includes setting up the checkpoint and state management (change log) factories. For example, we want to verify that the change log and checkpoint topics exist, and if not, create them with the proper number of partitions.

      We should pull this logic into a SetupJob class, and move the execution into a new YarnAppMasterListener called SamzaAppMasterSetup, which should do the job setup during the init() call. We should also execute the same SetupJob logic in the ProcessJob.submit and ThreadJob.submit methods.
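
      As a rough illustration of the proposal above, here is a minimal Java sketch of a setup step that creates any missing topics and is safe to run more than once. The TopicAdmin interface, the in-memory implementation, the topic names, and the partition counts are all hypothetical and exist only to make the example runnable; they are not taken from the Samza code base, and the real SetupJob may look quite different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names or signatures come from Samza itself.
public class SetupJobSketch {

    /** Assumed stand-in for whatever admin API checks for and creates topics. */
    interface TopicAdmin {
        boolean exists(String topic);
        void create(String topic, int partitions);
    }

    /** In-memory TopicAdmin, used only so the sketch can run without a broker. */
    static class InMemoryTopicAdmin implements TopicAdmin {
        final Map<String, Integer> topics = new LinkedHashMap<>();
        int createCalls = 0;

        @Override
        public boolean exists(String topic) {
            return topics.containsKey(topic);
        }

        @Override
        public void create(String topic, int partitions) {
            createCalls++;
            topics.put(topic, partitions);
        }
    }

    /** Per-job setup, intended to run once on one thread before any TaskRunner starts. */
    static class SetupJob {
        private final TopicAdmin admin;
        private final Map<String, Integer> requiredTopics;

        SetupJob(TopicAdmin admin, Map<String, Integer> requiredTopics) {
            this.admin = admin;
            this.requiredTopics = requiredTopics;
        }

        /** Creates each required topic only if it does not exist yet. */
        void run() {
            for (Map.Entry<String, Integer> entry : requiredTopics.entrySet()) {
                if (!admin.exists(entry.getKey())) {
                    admin.create(entry.getKey(), entry.getValue());
                }
            }
        }
    }

    public static void main(String[] args) {
        InMemoryTopicAdmin admin = new InMemoryTopicAdmin();
        Map<String, Integer> required = new LinkedHashMap<>();
        required.put("my-job-checkpoint", 1); // illustrative partition counts
        required.put("my-job-changelog", 4);

        SetupJob setup = new SetupJob(admin, required);
        setup.run();
        setup.run(); // second run finds both topics and creates nothing

        System.out.println(admin.topics + " createCalls=" + admin.createCalls);
    }
}
```

      In this sketch the same run() call could be made from the AM listener's init() or from a local job's submit method, which is the single-caller property the proposal is after.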

      The motivation for this is threefold:

      1. There is a race condition in the TaskRunner when multiple containers for a single job are running in YARN: each TaskRunner tries to create the checkpoint/change log topics if they don't exist.

      2. It makes implementing the TaskRunner logic in other languages easier, since non-Java TaskRunner implementations won't have to set the topics up. The SetupJob class will be run in the AM (under YARN) or in the Java code of the ProcessJob/ThreadJob (for local jobs).

      3. It gives us a place to run a single chunk of code in a controlled, single-threaded way, before any of the TaskRunners start.

      Some things to consider: is it OK to hard-code that the SetupJob class always sets up only the checkpoint manager and change log topics? Or do we need to add a setup() method to the lifecycle for everything?
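
      To make the second option concrete, here is a small Java sketch of what a generic setup() lifecycle hook might look like. The JobSetup interface, the RecordingSetup class, and the runAll helper are hypothetical names invented for this illustration; nothing here is an existing Samza interface, and the hooks only record their names instead of doing real work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "setup() in the lifecycle" alternative.
public class LifecycleSetupSketch {

    /** Assumed lifecycle hook that any component needing per-job setup would implement. */
    interface JobSetup {
        void setup();
    }

    /** Toy component that records its name when set up, standing in for real setup work. */
    static class RecordingSetup implements JobSetup {
        private final String name;
        private final List<String> log;

        RecordingSetup(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void setup() {
            log.add(name);
        }
    }

    /** Runs every registered hook once, in order, on the calling thread. */
    static void runAll(List<JobSetup> hooks) {
        for (JobSetup hook : hooks) {
            hook.setup();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<JobSetup> hooks = Arrays.asList(
                new RecordingSetup("checkpoint", log),
                new RecordingSetup("changelog", log));
        runAll(hooks);
        System.out.println(log);
    }
}
```

      The trade-off in this sketch is that the hard-coded approach keeps the setup step small, while a hook like this lets new components register setup work without changing the setup class.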

            People

              Assignee: Unassigned
              Reporter: Chris Riccomini (criccomini)
              Votes: 0
              Watchers: 2