Samza / SAMZA-1064

Standalone Samza with Zookeeper for Coordination


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      In this use case, we propose using ZooKeeper to coordinate processor liveness and task distribution in a Samza job. This approach also makes it possible to run a job with a flexible number of participating processors, with no fixed container count.
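
      The idea of redistributing tasks as processors come and go can be illustrated with a minimal in-memory sketch. It does not talk to ZooKeeper and is not Samza's implementation: the function names `elect_leader` and `assign_tasks`, the `(sequence, processor_id)` registration tuples, and the round-robin policy are all assumptions made for illustration. The sketch assumes each live processor holds a registration with an increasing sequence number, the processor with the lowest number acts as leader, and the leader recomputes the task assignment whenever membership changes.

```python
def elect_leader(registrations):
    """Return the id of the processor with the lowest sequence number.

    `registrations` is a list of (sequence, processor_id) tuples, a
    hypothetical stand-in for sequentially numbered liveness entries.
    """
    if not registrations:
        raise ValueError("no live processors")
    return min(registrations, key=lambda reg: reg[0])[1]


def assign_tasks(task_names, processor_ids):
    """Distribute tasks round-robin over however many processors are live."""
    if not processor_ids:
        raise ValueError("no live processors")
    ordered = sorted(processor_ids)
    assignment = {pid: [] for pid in ordered}
    for index, task in enumerate(sorted(task_names)):
        assignment[ordered[index % len(ordered)]].append(task)
    return assignment


tasks = ["Partition 0", "Partition 1", "Partition 2", "Partition 3"]

# Two processors are live; the one that registered first leads.
live = [(0, "processor-a"), (1, "processor-b")]
leader = elect_leader(live)
before = assign_tasks(tasks, [pid for _, pid in live])

# processor-a goes away and processor-c joins; leadership moves to the
# surviving processor with the lowest sequence number.
live = [(1, "processor-b"), (2, "processor-c")]
new_leader = elect_leader(live)
after = assign_tasks(tasks, [pid for _, pid in live])

print(leader, before)
print(new_leader, after)
```

      Because the assignment is computed from whichever processors are currently registered, nothing in the sketch depends on a fixed container count.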

      Attachments

        1. image_0.png
          5 kB
          Navina Ramesh
        2. image_1.png
          62 kB
          Navina Ramesh
        3. Samza Standalone-0.md
          30 kB
          Navina Ramesh
        4. Samza Standalone-0.pdf
          322 kB
          Navina Ramesh
        5. SamzaStandalone1.md
          30 kB
          Boris Shkolnik
        6. SamzaStandalone1.pdf
          75 kB
          Boris Shkolnik

        Issue Links

          1. Implement Leader Election using ZK (Sub-task, Resolved, Navina Ramesh)
          2. Implement Simple grouper by container, with support for arbitrary container ids (Sub-task, Resolved, Boris Shkolnik)
          3. Implement debounce scheduler (Sub-task, Resolved, Boris Shkolnik)
          4. ZkController for communication with the ZK (Sub-task, Resolved, Boris Shkolnik)
          5. Barrier for job model update (Sub-task, Resolved, Boris Shkolnik)
          6. Create test utils for Zk based Standalone testing (Sub-task, Resolved, Boris Shkolnik)
          7. add utils for publishing job model and job model version. (Sub-task, Resolved, Boris Shkolnik)
          8. Implement startup and shutdown sequence of jobs in ZK environment (Sub-task, Open, Navina Ramesh)
          9. create ZK based JobCoordinator (Sub-task, Resolved, Boris Shkolnik)
          10. add a time out for ZkVersionUpgradeBarrier (Sub-task, Resolved, Boris Shkolnik)
          11. Semantics of ProcessorId in Samza (Sub-task, Resolved, Navina Ramesh)
          12. verify that SamzaContainerController.stop shuts down the container completely. (Sub-task, Resolved, Navina Ramesh)
          13. simplify ZK barrier for version upgrade. (Sub-task, Resolved, Boris Shkolnik)
          14. create ZkCoordination service (Sub-task, Resolved, Boris Shkolnik)
          15. List integration tests needed for the StandAlone project (Sub-task, Resolved, Boris Shkolnik)
          16. Metrics should be added for ZK based JobCoordinator (Sub-task, In Progress, Navina Ramesh)
          17. Review tryBecomeLeader implementation to see if it can be simplified. (Sub-task, Open, Unassigned)
          18. ZK cleanup (Sub-task, Closed, Boris Shkolnik)
          19. Locality should be used in standalone execution (host-affinity) (Sub-task, Open, Navina Ramesh)
          20. Fix ZK path issues + container IDs generation. (Sub-task, Resolved, Boris Shkolnik)
          21. ZkController should close zk connection on stop() (Sub-task, Resolved, Boris Shkolnik)
          22. add new StandAlone configs to the configuration doc (Sub-task, Closed, Boris Shkolnik)
          23. if ZkConnect string contains extra path, it needs to be created on the ZK. (Sub-task, Closed, Boris Shkolnik)
          24. update samza configs with StandAlone tutorial link, when it is available. (Sub-task, Resolved, Boris Shkolnik)
          25. HappyPathTesting. Single and multiple processors happy path. (Sub-task, Resolved, Boris Shkolnik)
          26. HappyPath Testing. Rolling Upgrades. (Sub-task, Open, Shanthoosh Venkataraman)
          27. Error Testing. ZkUnavailable. (Sub-task, Resolved, Boris Shkolnik)
          28. Error Testing. Failing processor. (Sub-task, Resolved, Boris Shkolnik)
          29. Failure Testing. A processor dies while waiting for a barrier to complete. (Sub-task, Open, Shanthoosh Venkataraman)
          30. add a config for de-bounce time. (Sub-task, Resolved, Boris Shkolnik)
          31. If LocalAppRunner is used - use ZkJobCoordinator by default. (Sub-task, Resolved, Unassigned)
          32. HappyPath Testing. Using local storage. (Sub-task, Open, Unassigned)
          33. Rename StandaloneJobCoordinator to PassthroughJobCoordinator (Sub-task, Resolved, Boris Shkolnik)
          34. Stand alone integration tests. (Sub-task, Resolved, Shanthoosh Venkataraman)
          35. HappyPathTesting. Verify JM generated only once for multiple added SP (Sub-task, Open, Shanthoosh Venkataraman)
          36. Error Testing. If all processors die, LAR should shut down. (Sub-task, Open, Shanthoosh Venkataraman)
          37. session expiration propagation (Sub-task, Resolved, Boris Shkolnik)
          38. LocalApplicationRunner needs to support StreamTask (Sub-task, Resolved, Boris Shkolnik)

            People

              Assignee: Navina Ramesh (navina)
              Reporter: Navina Ramesh (navina)
              Votes: 1
              Watchers: 4

              Dates

                Created:
                Updated: