Apache S4 / S4-27

extensions to cluster configuration through Zookeeper

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.5.0
    • Labels:
      None

      Description

      Applications running on S4 clusters are configured through Zookeeper.

      We need to extend the current configuration properties so that more of the features used or required by S4 (streams, SLAs, states, etc.) can be configured.

      Current configuration
      ----------------------------

      It is currently limited to:

      • assigning tasks to logical partitions (S4 nodes)
      • publishing applications, retrievable from remote repositories

      Available tasks, assigned tasks and applications are defined as znodes. Each znode carries metadata (the data associated with the node), stored as JSON (see the ZNRecord class).

      The resulting structure in Zookeeper is currently:
      1. tasks

      • /<cluster-name>/tasks for available tasks
      • /<cluster-name>/tasks/Task-0, for instance, represents 1 logical task; its metadata contains the task id and the partition id
      • /<cluster-name>/process for tasks assigned to S4 nodes
      • /<cluster-name>/process/Task-0 is an ephemeral node created by the S4 node that took the Task-0 task; its metadata contains the hostname of that S4 node

      2. apps

      • /<cluster-name>/apps for applications
      • /<cluster-name>/apps/app1, for instance, is the application "app1" running on the (logical) cluster; its metadata contains just the URI for fetching the S4R archive with the application code
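
      The layout above can be sketched as a mapping from znode path to JSON metadata. This is only an illustration: the JSON field names (taskId, partition, host, s4r_uri), the hostnames and the URI are assumptions, and the real ZNRecord serialization may differ.

```python
import json


def current_layout(cluster, tasks, assignments, apps):
    """Return {znode path: JSON metadata} for the structure described above.

    tasks:       {task_id: partition_id}   -> /<cluster>/tasks/<task_id>
    assignments: {task_id: hostname}       -> /<cluster>/process/<task_id> (ephemeral in Zookeeper)
    apps:        {app_name: s4r_uri}       -> /<cluster>/apps/<app_name>
    """
    root = f"/{cluster}"
    # Parent znodes carry no metadata in this sketch.
    znodes = {f"{root}/tasks": "", f"{root}/process": "", f"{root}/apps": ""}
    for task_id, partition in tasks.items():
        # Field names are hypothetical, chosen for illustration only.
        znodes[f"{root}/tasks/{task_id}"] = json.dumps(
            {"taskId": task_id, "partition": partition})
    for task_id, host in assignments.items():
        znodes[f"{root}/process/{task_id}"] = json.dumps({"host": host})
    for app, uri in apps.items():
        znodes[f"{root}/apps/{app}"] = json.dumps({"s4r_uri": uri})
    return znodes


layout = current_layout(
    "cluster1",
    tasks={"Task-0": 0},
    assignments={"Task-0": "node-a.example.org"},
    apps={"app1": "http://repo.example.org/app1.s4r"},
)
for path in sorted(layout):
    print(path, layout[path])
```

      A plain dictionary is used instead of a Zookeeper client so that the sketch runs without a live ensemble; it does not model ephemeral node lifetime.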

      What we need to add
      ----------------------------

      (just some starting points that can be seen as subtasks):

      1. nodes state: it would be really useful to have a general view of the available S4 nodes for a given logical cluster. In particular: which nodes are available, and what is their state (initializing, ready, stopped, processing a task, in standby, ...)?
      --> we could use a new directory /<cluster-name>/nodes and metadata could contain information about the node, and notably its state
      --> the corresponding ephemeral znode would be maintained by the Server instance or a related entity

      2. streams: if we want to implement inter-app communication through streams, then streams should be configurable through Zookeeper.
      --> streams could appear in /<cluster-name>/streams

      • Metadata for streams could include the partitioning scheme (as suggested by Kishore in S4-10).
      • Metadata could also include a key finder string
      • child nodes could list the applications using the stream
      --> the corresponding persistent znode would be created at application startup. If the stream znode already exists, it would be reused.
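
      The two proposals above (an ephemeral znode per node carrying its state, and a persistent stream znode that is created once and then reused) could look roughly like the following. FakeZk is a hypothetical in-memory stand-in, not a Zookeeper client API, and the metadata field names (state, partitioning, keyFinder) are assumptions.

```python
import json
from enum import Enum


class NodeState(Enum):
    # States named in the proposal; the exact set was left open there.
    INITIALIZING = "initializing"
    READY = "ready"
    STOPPED = "stopped"
    PROCESSING = "processing"
    STANDBY = "standby"


class FakeZk:
    """In-memory stand-in for a Zookeeper tree (hypothetical helper)."""

    def __init__(self):
        self.znodes = {}  # path -> {"data": str, "ephemeral": bool}

    def exists(self, path):
        return path in self.znodes

    def put(self, path, data, ephemeral=False):
        # Creates the znode or overwrites its data.
        self.znodes[path] = {"data": data, "ephemeral": ephemeral}

    def children(self, path):
        # Direct children only: names with no further "/" after the prefix.
        prefix = path + "/"
        return sorted(p[len(prefix):] for p in self.znodes
                      if p.startswith(prefix) and "/" not in p[len(prefix):])


def publish_node_state(zk, cluster, node_id, state):
    """Write the node's state to an ephemeral znode under /<cluster>/nodes."""
    path = f"/{cluster}/nodes/{node_id}"
    zk.put(path, json.dumps({"state": state.value}), ephemeral=True)
    return path


def register_stream(zk, cluster, stream, app, partitioning, key_finder):
    """Create the persistent stream znode if absent, else reuse it.

    Returns True when the stream znode was created by this call.
    """
    path = f"/{cluster}/streams/{stream}"
    created = not zk.exists(path)
    if created:
        zk.put(path, json.dumps(
            {"partitioning": partitioning, "keyFinder": key_finder}))
    # A child znode records that this application uses the stream.
    zk.put(f"{path}/{app}", "")
    return created


zk = FakeZk()
publish_node_state(zk, "cluster1", "node-a", NodeState.READY)
first = register_stream(zk, "cluster1", "clicks", "app1", "hash", "userId")
second = register_stream(zk, "cluster1", "clicks", "app2", "round-robin", "other")
print(first, second, zk.children("/cluster1/streams/clicks"))
```

      In this sketch the second registration leaves the existing stream metadata untouched, which is one possible reading of "it would be reused"; whether conflicting metadata should be rejected instead is left open.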

        Activity

        Matthieu Morel added a comment -

        Updates for 0.5 included handling multiple subclusters, 1 app per subcluster, and inter-cluster communication through published streams. All the related znode hierarchies have been incorporated into the Zookeeper cluster configuration.

        New related attributes or changes will be addressed in new tickets.

        Matthieu Morel added a comment -

        Developments in S4-22 added the following concepts:

        • Multiple subclusters in an S4 namespace, e.g.:
          /s4/clusters/cluster1
          /s4/clusters/cluster2
        • Publication of stream producers and consumers, described here: https://issues.apache.org/jira/secure/attachment/12531387/Inter%20cluster%20communication%20in%20S4%20piper.pdf

          People

          • Assignee:
            Matthieu Morel
            Reporter:
            Matthieu Morel