Samza / SAMZA-1064

Standalone Samza with Zookeeper for Coordination


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      For the standalone use case, we propose using ZooKeeper to coordinate processor liveness and task distribution in a Samza job. This also makes it possible to run a job with a flexible number of participating processors (no fixed container count).
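
      As a rough illustration of what "task distribution" over a flexible set of processors could look like, the sketch below reassigns tasks round-robin whenever the set of live processors changes. It is not Samza's grouper: the class name `RoundRobinTaskAssigner`, the method `assign`, and the processor and task names are all hypothetical. The list of live processor IDs is assumed to come from some membership view, such as the one ZooKeeper would provide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not part of Samza: distributes tasks over whatever
// processors are currently alive, with no fixed container count.
public class RoundRobinTaskAssigner {

    /**
     * Assigns tasks round-robin across the live processors.
     * Processor IDs are sorted first, so the result depends only on the
     * membership set and the task order, not on the order IDs were reported in.
     */
    public static Map<String, List<String>> assign(List<String> liveProcessorIds,
                                                   List<String> taskNames) {
        if (liveProcessorIds.isEmpty()) {
            throw new IllegalArgumentException("at least one live processor is required");
        }
        List<String> processors = new ArrayList<>(liveProcessorIds);
        Collections.sort(processors);

        Map<String, List<String>> assignment = new TreeMap<>();
        for (String processor : processors) {
            assignment.put(processor, new ArrayList<>());
        }
        for (int i = 0; i < taskNames.size(); i++) {
            String owner = processors.get(i % processors.size());
            assignment.get(owner).add(taskNames.get(i));
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<String> tasks = Arrays.asList("t0", "t1", "t2", "t3");
        // Two processors are alive.
        System.out.println(assign(Arrays.asList("p1", "p2"), tasks));
        // A third processor joins; the same function yields the new distribution.
        System.out.println(assign(Arrays.asList("p1", "p2", "p3"), tasks));
    }
}
```

      Because the assignment is a pure function of the membership set, a single coordinator can recompute and publish it each time a processor joins or leaves.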

      Attachments

        1. image_0.png
          5 kB
          Navina Ramesh
        2. image_1.png
          62 kB
          Navina Ramesh
        3. Samza Standalone-0.md
          30 kB
          Navina Ramesh
        4. Samza Standalone-0.pdf
          322 kB
          Navina Ramesh
        5. SamzaStandalone1.md
          30 kB
          Boris Shkolnik
        6. SamzaStandalone1.pdf
          75 kB
          Boris Shkolnik

        Issue Links

        1. Implement Leader Election using ZK | Sub-task | Resolved | Navina Ramesh
        2. Implement Simple grouper by container, with support for arbitrary container ids | Sub-task | Resolved | Boris Shkolnik
        3. Implement debounce scheduler | Sub-task | Resolved | Boris Shkolnik
        4. ZkController for communication with the ZK | Sub-task | Resolved | Boris Shkolnik
        5. Barrier for job model update | Sub-task | Resolved | Boris Shkolnik
        6. Create test utils for Zk based Standalone testing | Sub-task | Resolved | Boris Shkolnik
        7. Add utils for publishing job model and job model version | Sub-task | Resolved | Boris Shkolnik
        8. Implement startup and shutdown sequence of jobs in ZK environment | Sub-task | Open | Navina Ramesh
        9. Create ZK based JobCoordinator | Sub-task | Resolved | Boris Shkolnik
        10. Add a time out for ZkVersionUpgradeBarrier | Sub-task | Resolved | Boris Shkolnik
        11. Semantics of ProcessorId in Samza | Sub-task | Resolved | Navina Ramesh
        12. Verify that SamzaContainerController.stop shuts down the container completely | Sub-task | Resolved | Navina Ramesh
        13. Simplify ZK barrier for version upgrade | Sub-task | Resolved | Boris Shkolnik
        14. Create ZkCoordination service | Sub-task | Resolved | Boris Shkolnik
        15. List integration tests needed for the StandAlone project | Sub-task | Resolved | Boris Shkolnik
        16. Metrics should be added for ZK based JobCoordinator | Sub-task | In Progress | Navina Ramesh
        17. Review tryBecomeLeader implementation to see if it can be simplified | Sub-task | Open | Unassigned
        18. ZK cleanup | Sub-task | Closed | Boris Shkolnik
        19. Locality should be used in standalone execution (host-affinity) | Sub-task | Open | Navina Ramesh
        20. Fix ZK path issues + container IDs generation | Sub-task | Resolved | Boris Shkolnik
        21. ZkController should close zk connection on stop() | Sub-task | Resolved | Boris Shkolnik
        22. Add new StandAlone configs to the configuration doc | Sub-task | Closed | Boris Shkolnik
        23. If ZkConnect string contains extra path, it needs to be created on the ZK | Sub-task | Closed | Boris Shkolnik
        24. Update samza configs with StandAlone tutorial link, when it is available | Sub-task | Resolved | Boris Shkolnik
        25. HappyPathTesting. Single and multiple processors happy path | Sub-task | Resolved | Boris Shkolnik
        26. HappyPath Testing. Rolling Upgrades | Sub-task | Open | Shanthoosh Venkataraman
        27. Error Testing. ZkUnavailable | Sub-task | Resolved | Boris Shkolnik
        28. Error Testing. Failing processor | Sub-task | Resolved | Boris Shkolnik
        29. Failure Testing. A processor dies while waiting for a barrier to complete | Sub-task | Open | Shanthoosh Venkataraman
        30. Add a config for de-bounce time | Sub-task | Resolved | Boris Shkolnik
        31. If LocalAppRunner is used - use ZkJobCoordinator by default | Sub-task | Resolved | Unassigned
        32. HappyPath Testing. Using local storage | Sub-task | Open | Unassigned
        33. Rename StandaloneJobCoordinator to PassthroughJobCoordinator | Sub-task | Resolved | Boris Shkolnik
        34. Stand alone integration tests | Sub-task | Resolved | Shanthoosh Venkataraman
        35. HappyPathTesting. Verify JM generated only once for multiple added SP | Sub-task | Open | Shanthoosh Venkataraman
        36. Error Testing. If all processors die, LAR should shut down | Sub-task | Open | Shanthoosh Venkataraman
        37. Session expiration propagation | Sub-task | Resolved | Boris Shkolnik
        38. LocalApplicationRunner needs to support StreamTask | Sub-task | Resolved | Boris Shkolnik

          People

            navina Navina Ramesh
