Apache S4
  1. Apache S4
  2. S4-22

Adaptor + inter app communication

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.5.0
    • Labels:
      None

      Description

      Need an adaptor for v0.5

      Idea I posted earlier:

      What do you think of this idea for a simple adaptor:

      • Adaptor extends App
      • Adaptor can send events but not receive (for now)
      • Adaptor is deployed as a regular App to the S4 cluster and as an
        Adaptor type in a host (separate from the S4 cluster).
      • Adaptor, unlike regular apps, can accept event data (in any format)
        directly, not via comm layer.
      • Input data is transformed into S4 events using a modular approach
        and by providing standard modules such as JSON.
      • Output events are exposed using EventSource and consumed by other
        apps without even knowing that they are Adaptors (only the App type is
        exposed in the cluster).
      • S4 events can be processed locally using PEs and Streams as usual.
        (We kind of need to get a local Sender for the local PEs and a
        standard cluster Sender for the EventSource object.)

      So why this approach?

      The GOOD:

      • Seems to be the least disruptive way to inject external events
      • Apps can easily consume the events in a modular way without any
        dependencies. Getting events from an adaptor or from another app is
        identical.
      • The adaptor would be packaged and deployed to the cluster as if it
        was an App (no incremental cost)
      • The adaptor can do preprocessing using the same programming model
        and can reuse PEs.

      The CHALLENGE:

      • We need to also deploy the Adaptor in a separate host. On the other
        hand, this is inevitable. At least we use the same approach instead of
        creating a different system.
      • The Adaptor will need to be integrated with ZK to get the physical addresses.
      • We need to deal with two senders.

      for later: two-way communication and adapter clusters.

      thoughts?

      1. s4-subclusters.pdf
        930 kB
        Leo Neumeyer
      2. Inter cluster communication in S4 piper.pdf
        1.10 MB
        Matthieu Morel
      There are no Sub-Tasks for this issue.

        Activity

        Hide
        Matthieu Morel added a comment -

        Leo also added +1 on the mailing list, so I proceeded and merged into the piper branch. Final commit is 121f33066dc4bf02bd37c618312c97874c6c73c4

        Show
        Matthieu Morel added a comment - Leo also added +1 on the mailing list, so I proceeded and merged into the piper branch. Final commit is 121f33066dc4bf02bd37c618312c97874c6c73c4
        Hide
        Karthik Kambatla added a comment -

        +1 Merge!

        Show
        Karthik Kambatla added a comment - +1 Merge!
        Hide
        Matthieu Morel added a comment -

        S4-22 branch now adresses the requirements for create adapters that can send events to S4 applications. Adapters are actually S4 applications, as suggested in the initial proposal. They send events to interested apps by using a pub/sub mechanism, described in https://issues.apache.org/jira/secure/attachment/12531387/Inter%20cluster%20communication%20in%20S4%20piper.pdf

        We also wrote a walkthrough that uses current code in S4-22 : https://cwiki.apache.org/confluence/display/S4/S4+piper+walkthrough

        At this point, S4-22 is much more advanced than the main piper branch (there were really a lot of changes involved), and applications can be built with it. I think we should move forward and merge this branch into the main piper branch.

        Once it is merged, any issue/improvement can then be adressed through new tickets, but at least we can move on.

        Please vote on that proposal!

        Show
        Matthieu Morel added a comment - S4-22 branch now adresses the requirements for create adapters that can send events to S4 applications. Adapters are actually S4 applications, as suggested in the initial proposal. They send events to interested apps by using a pub/sub mechanism, described in https://issues.apache.org/jira/secure/attachment/12531387/Inter%20cluster%20communication%20in%20S4%20piper.pdf We also wrote a walkthrough that uses current code in S4-22 : https://cwiki.apache.org/confluence/display/S4/S4+piper+walkthrough At this point, S4-22 is much more advanced than the main piper branch (there were really a lot of changes involved), and applications can be built with it. I think we should move forward and merge this branch into the main piper branch. Once it is merged, any issue/improvement can then be adressed through new tickets, but at least we can move on. Please vote on that proposal!
        Hide
        Matthieu Morel added a comment - - edited

        I just added a document describing the inter cluster communication as implemented in S4-22 branch

        Show
        Matthieu Morel added a comment - - edited I just added a document describing the inter cluster communication as implemented in S4-22 branch
        Hide
        Matthieu Morel added a comment -

        I have addressed the issue 2 mentioned above by using 2 kinds of Events, and by using different serializers with different classloaders:

        • at the application level one may use a specific Event class (subclass of Event)
        • at the communication level this is encapsulated into a EventMessage object

        updates are in branch S4-22 so that it will be easier to merge with piper, see commit http://git-wip-us.apache.org/repos/asf?p=incubator-s4.git;a=commit;h=f416374e

        Show
        Matthieu Morel added a comment - I have addressed the issue 2 mentioned above by using 2 kinds of Events, and by using different serializers with different classloaders: at the application level one may use a specific Event class (subclass of Event) at the communication level this is encapsulated into a EventMessage object updates are in branch S4-22 so that it will be easier to merge with piper, see commit http://git-wip-us.apache.org/repos/asf?p=incubator-s4.git;a=commit;h=f416374e
        Hide
        Matthieu Morel added a comment -

        I have taken a new approach for this problem, by porting the twitter example from S4 0.3, and using an adapter cluster and a processing cluster.

        I also created several scripts for launching zookeeper, uploading cluster configurations to it, packaging apps and deploying them. See "s4" command for that. (it relies on other scripts that are generated through gradle)

        See branch S4-22-rebased and readme files in twitter-adapter and twitter-counter projects in test-apps. Also note that tools only seem to work with gradle 1.0M6

        The adapter-to-app communication uses the concept of a "remote" stream, which actually sends events to the specified app cluster.

        2 conclusions:
        1/ we need to have pluggable partitioning schemes (in order to perform some balancing from the adapter to the app cluster for instance)
        2/ I found an important issue with the multiclassloader approach. Indeed, since events must be deserialized by the comm layer, the comm layer needs to be aware of the event classes. Which is not the case if we load application classes at the application level only. I'm not sure what path to take: removing the multiclassloader may lead to conflicts between packaged apps and the platform...
        Currently the twitter example uses a workaround: only generic events, but this is a strong limitation.

        Show
        Matthieu Morel added a comment - I have taken a new approach for this problem, by porting the twitter example from S4 0.3, and using an adapter cluster and a processing cluster. I also created several scripts for launching zookeeper, uploading cluster configurations to it, packaging apps and deploying them. See "s4" command for that. (it relies on other scripts that are generated through gradle) See branch S4-22 -rebased and readme files in twitter-adapter and twitter-counter projects in test-apps. Also note that tools only seem to work with gradle 1.0M6 The adapter-to-app communication uses the concept of a "remote" stream, which actually sends events to the specified app cluster. 2 conclusions: 1/ we need to have pluggable partitioning schemes (in order to perform some balancing from the adapter to the app cluster for instance) 2/ I found an important issue with the multiclassloader approach. Indeed, since events must be deserialized by the comm layer, the comm layer needs to be aware of the event classes. Which is not the case if we load application classes at the application level only. I'm not sure what path to take: removing the multiclassloader may lead to conflicts between packaged apps and the platform... Currently the twitter example uses a workaround: only generic events, but this is a strong limitation.
        Hide
        Matthieu Morel added a comment -

        Note that for this update I had to disable some of the fluent API code, for instance:
        https://git-wip-us.apache.org/repos/asf?p=incubator-s4.git;a=blob;f=subprojects/s4-core/src/main/java/org/apache/s4/fluent/AppMaker.java;h=436ca4c462e55ed1985583506385b607082a33bd;hb=403162eb91aa37a78cba646105463453f5de4d21#l166

        Since Leo is redesigning the fluent API, this code commenting most probably won't have any impact.

        Show
        Matthieu Morel added a comment - Note that for this update I had to disable some of the fluent API code, for instance: https://git-wip-us.apache.org/repos/asf?p=incubator-s4.git;a=blob;f=subprojects/s4-core/src/main/java/org/apache/s4/fluent/AppMaker.java;h=436ca4c462e55ed1985583506385b607082a33bd;hb=403162eb91aa37a78cba646105463453f5de4d21#l166 Since Leo is redesigning the fluent API, this code commenting most probably won't have any impact.
        Hide
        Matthieu Morel added a comment -

        In order to integrate Leo's work and easily test the approach, I copied and reorganized the code for the test applications in a new branch, and reused the mechanism for automatic deployment from S4-24 so that it is now possible to run the test automatically from a single place.

        see https://git-wip-us.apache.org/repos/asf?p=incubator-s4.git;a=log;h=refs/heads/S4-22

        One can deploy and run Leo's "showtime" and "counter" apps simply by running the "TestProducerConsumer" test in s4-core. Apps get built, packaged and deployed automatically.

        The approach for handling streams looks good, we still have to (see the TODO from the commit log) :

        • make the code generic in Server.java
        • update the fluent API
        • test on different logical clusters
        Show
        Matthieu Morel added a comment - In order to integrate Leo's work and easily test the approach, I copied and reorganized the code for the test applications in a new branch, and reused the mechanism for automatic deployment from S4-24 so that it is now possible to run the test automatically from a single place. see https://git-wip-us.apache.org/repos/asf?p=incubator-s4.git;a=log;h=refs/heads/S4-22 One can deploy and run Leo's "showtime" and "counter" apps simply by running the "TestProducerConsumer" test in s4-core. Apps get built, packaged and deployed automatically. The approach for handling streams looks good, we still have to (see the TODO from the commit log) : make the code generic in Server.java update the fluent API test on different logical clusters
        Hide
        Matthieu Morel added a comment -

        I argue that for App2 to receive events from App1, we want App2 to have access to a local reference of App1. (Do we agree on this? otherwise I may be missing something.)

        This constraint implies tight coupling between apps. Maybe your requirement for full symmetry comes from that (deploying all apps on all nodes for allowing them to share code?).

        We should rather strive for loose coupling between apps. Indeed, apps are already deployed through different classloaders, and they may even be deployed in different VMs (nodes).

        To enable loose coupling we need to remove dependencies to
        1. specialized event types defined in apps
        2. specialized KeyFinder implementations

        The most convenient way is probably to use a generic approach (only required for inter-app communications):
        1. generic event types for inter-app communications.
        2. a generic and parameterizable KeyFinder implementation (e.g. string keys)

        And my understanding is that Leo already undertook this approach in his prototype!

        Note that the KeyFinder parameters could be retrieved directly from the stream configuration stored in Zookeeper. This relates to S4-27 .

        Show
        Matthieu Morel added a comment - I argue that for App2 to receive events from App1, we want App2 to have access to a local reference of App1. (Do we agree on this? otherwise I may be missing something.) This constraint implies tight coupling between apps. Maybe your requirement for full symmetry comes from that (deploying all apps on all nodes for allowing them to share code?). We should rather strive for loose coupling between apps. Indeed, apps are already deployed through different classloaders, and they may even be deployed in different VMs (nodes). To enable loose coupling we need to remove dependencies to 1. specialized event types defined in apps 2. specialized KeyFinder implementations The most convenient way is probably to use a generic approach (only required for inter-app communications): 1. generic event types for inter-app communications. 2. a generic and parameterizable KeyFinder implementation (e.g. string keys) And my understanding is that Leo already undertook this approach in his prototype! Note that the KeyFinder parameters could be retrieved directly from the stream configuration stored in Zookeeper. This relates to S4-27 .
        Hide
        Leo Neumeyer added a comment -

        For the records, here is the outpur:

        comm.module=org.apache.s4.comm.Module
        s4.logger_level=TRACE
        s4.apps.path=/tmp/s4apps
        comm.queue_emmiter_size=8000
        comm.queue_listener_size=8000
        cluster.hosts=localhost
        cluster.ports=5077
        cluster.lock_dir=/tmp
        cluster.isCluster=true
        11:37:04.602 [main] INFO org.apache.s4.comm.topology.Cluster - Added cluster node: localhost:5077
        11:37:04.645 [main] INFO o.a.s.c.topology.AssignmentFromFile - Host Name: 10.0.1.97
        11:37:04.646 [main] INFO o.a.s.c.topology.AssignmentFromFile - Partition available: true
        11:37:04.666 [main] INFO o.a.s.c.topology.AssignmentFromFile - Partition acquired by PID:41322 HOST:Leo.local Lock File location: /tmp/s4-0.lock
        11:37:04.666 [main] INFO o.a.s.c.topology.AssignmentFromFile - Acquire partition:success.
        11:37:04.676 [main] INFO org.apache.s4.core.Server - Loading app: /tmp/s4apps/s4-counter-0.0.0-SNAPSHOT.s4r
        11:37:04.680 [main] INFO org.apache.s4.core.Server - App class name is: s4app.ClockApp
        11:37:04.680 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ClockApp, resolveIt: true
        11:37:04.681 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ClockApp]
        11:37:04.681 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.App, resolveIt: true
        11:37:04.683 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.App]
        11:37:04.683 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ClockApp]
        11:37:04.684 [main] INFO org.apache.s4.core.Server - Loading app: /tmp/s4apps/s4-showtime-0.0.0-SNAPSHOT.s4r
        11:37:04.685 [main] INFO org.apache.s4.core.Server - App class name is: s4app.ShowTimeApp
        11:37:04.685 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ShowTimeApp, resolveIt: true
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ShowTimeApp]
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.App, resolveIt: true
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.App]
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ShowTimeApp]
        11:37:04.686 [main] INFO org.apache.s4.core.Server - Starting app s4app.ClockApp
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.System, resolveIt: true
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.System]
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.io.PrintStream, resolveIt: true
        11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.io.PrintStream]
        Initing CounterApp...
        11:37:04.687 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ClockPE, resolveIt: true
        11:37:04.687 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ClockPE]
        11:37:04.687 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.ProcessingElement, resolveIt: true
        11:37:04.689 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.ProcessingElement]
        11:37:04.689 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ClockPE]
        11:37:04.692 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.LoggerFactory, resolveIt: true
        11:37:04.693 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.LoggerFactory]
        11:37:04.695 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.Streamable, resolveIt: true
        11:37:04.695 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.Streamable]
        11:37:04.708 [main] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping generated overload dispatcher class for PE of class [class s4app.ClockPE]
        11:37:04.714 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: OverloadDispatcher2841, resolveIt: true
        11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [OverloadDispatcher2841]
        11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.gen.OverloadDispatcher, resolveIt: true
        11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.gen.OverloadDispatcher]
        11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Object, resolveIt: true
        11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Object]
        11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [OverloadDispatcher2841]
        11:37:04.746 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.util.concurrent.TimeUnit, resolveIt: true
        11:37:04.746 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.util.concurrent.TimeUnit]
        11:37:04.747 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.EventSource, resolveIt: true
        11:37:04.749 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.EventSource]
        Starting CounterApp...
        11:37:04.751 [main] TRACE org.apache.s4.core.ProcessingElement - OnCreateInternal
        11:37:04.757 [main] TRACE org.apache.s4.core.ProcessingElement - Num PE instances: 0.
        11:37:04.757 [main] INFO org.apache.s4.core.Server - Starting app s4app.ShowTimeApp
        11:37:04.757 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.System, resolveIt: true
        11:37:04.757 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.System]
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.io.PrintStream, resolveIt: true
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.io.PrintStream]
        Initing ShowTimeApp...
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ShowPE, resolveIt: true
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ShowPE]
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.ProcessingElement, resolveIt: true
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.ProcessingElement]
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ShowPE]
        11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.LoggerFactory, resolveIt: true
        11:37:04.759 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.LoggerFactory]
        11:37:04.759 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.base.Event, resolveIt: true
        11:37:04.759 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.base.Event]
        11:37:04.760 [main] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping generated overload dispatcher class for PE of class [class s4app.ShowPE]
        11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: OverloadDispatcher783, resolveIt: true
        11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [OverloadDispatcher783]
        11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.gen.OverloadDispatcher, resolveIt: true
        11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.gen.OverloadDispatcher]
        11:37:04.762 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Object, resolveIt: true
        11:37:04.762 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Object]
        11:37:04.762 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [OverloadDispatcher783]
        Starting ShowTimeApp...
        11:37:04.763 [main] TRACE org.apache.s4.core.ProcessingElement - OnCreateInternal
        11:37:04.763 [main] TRACE org.apache.s4.core.ProcessingElement - Num PE instances: 0.
        11:37:04.763 [main] INFO org.apache.s4.core.Server - Resolving dependencies.
        11:37:04.763 [main] INFO org.apache.s4.core.Server - Resolving dependencies for s4app.ClockApp
        11:37:04.764 [main] INFO org.apache.s4.core.Server - App [s4app.ClockApp] exports event source [I can give you the time!].
        11:37:04.764 [main] INFO org.apache.s4.core.Server - Resolving dependencies for s4app.ShowTimeApp
        11:37:04.764 [main] INFO org.apache.s4.core.Server - The consumer app is [s4app.ShowTimeApp].
        11:37:04.764 [main] INFO org.apache.s4.core.Server - Subscribing stream [I need the time.] from app [s4app.ShowTimeApp] to event source.
        11:37:04.764 [main] INFO org.apache.s4.core.EventSource - Subscribing stream: I need the time. to event source: I can give you the time!.
        11:37:04.764 [main] INFO org.apache.s4.core.Server - Completed applications startup.
        11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.base.Event, resolveIt: true
        11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.base.Event]
        11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Long, resolveIt: true
        11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Long]
        11:37:05.752 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.Logger, resolveIt: true
        11:37:05.752 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.Logger]
        11:37:05.752 [Timer-0] INFO s4app.ClockPE - Sending event with tick 0 and time 1323545825749.
        11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Long, resolveIt: true
        11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Long]
        11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.Logger, resolveIt: true
        11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.Logger]
        11:37:05.761 [I need the time.] INFO s4app.ShowPE - Received event with tick 0 and time 1323545825749.
        11:37:06.749 [Timer-0] INFO s4app.ClockPE - Sending event with tick 1 and time 1323545826749.
        11:37:06.749 [I need the time.] INFO s4app.ShowPE - Received event with tick 1 and time 1323545826749.
        11:37:07.750 [Timer-0] INFO s4app.ClockPE - Sending event with tick 2 and time 1323545827750.
        11:37:07.750 [I need the time.] INFO s4app.ShowPE - Received event with tick 2 and time 1323545827750.
        11:37:08.751 [Timer-0] INFO s4app.ClockPE - Sending event with tick 3 and time 1323545828751.
        11:37:08.752 [I need the time.] INFO s4app.ShowPE - Received event with tick 3 and time 1323545828751.
        11:37:09.752 [Timer-0] INFO s4app.ClockPE - Sending event with tick 4 and time 1323545829752.
        11:37:09.752 [I need the time.] INFO s4app.ShowPE - Received event with tick 4 and time 1323545829752.
        11:37:10.753 [Timer-0] INFO s4app.ClockPE - Sending event with tick 5 and time 1323545830753.
        11:37:10.754 [I need the time.] INFO s4app.ShowPE - Received event with tick 5 and time 1323545830753.
        11:37:11.754 [Timer-0] INFO s4app.ClockPE - Sending event with tick 6 and time 1323545831754.
        11:37:11.754 [I need the time.] INFO s4app.ShowPE - Received event with tick 6 and time 1323545831754.
        11:37:12.755 [Timer-0] INFO s4app.ClockPE - Sending event with tick 7 and time 1323545832755.
        11:37:12.755 [I need the time.] INFO s4app.ShowPE - Received event with tick 7 and time 1323545832755.
        11:37:13.757 [Timer-0] INFO s4app.ClockPE - Sending event with tick 8 and time 1323545833757.
        11:37:13.757 [I need the time.] INFO s4app.ShowPE - Received event with tick 8 and time 1323545833757.
        11:37:14.756 [Timer-0] INFO s4app.ClockPE - Sending event with tick 9 and time 1323545834756.
        11:37:14.756 [I need the time.] INFO s4app.ShowPE - Received event with tick 9 and time 1323545834756.
        11:37:15.758 [Timer-0] INFO s4app.ClockPE - Sending event with tick 10 and time 1323545835757.
        11:37:15.758 [I need the time.] INFO s4app.ShowPE - Received event with tick 10 and time 1323545835757.

        Show
        Leo Neumeyer added a comment - For the records, here is the outpur: comm.module=org.apache.s4.comm.Module s4.logger_level=TRACE s4.apps.path=/tmp/s4apps comm.queue_emmiter_size=8000 comm.queue_listener_size=8000 cluster.hosts=localhost cluster.ports=5077 cluster.lock_dir=/tmp cluster.isCluster=true 11:37:04.602 [main] INFO org.apache.s4.comm.topology.Cluster - Added cluster node: localhost:5077 11:37:04.645 [main] INFO o.a.s.c.topology.AssignmentFromFile - Host Name: 10.0.1.97 11:37:04.646 [main] INFO o.a.s.c.topology.AssignmentFromFile - Partition available: true 11:37:04.666 [main] INFO o.a.s.c.topology.AssignmentFromFile - Partition acquired by PID:41322 HOST:Leo.local Lock File location: /tmp/s4-0.lock 11:37:04.666 [main] INFO o.a.s.c.topology.AssignmentFromFile - Acquire partition:success. 11:37:04.676 [main] INFO org.apache.s4.core.Server - Loading app: /tmp/s4apps/s4-counter-0.0.0-SNAPSHOT.s4r 11:37:04.680 [main] INFO org.apache.s4.core.Server - App class name is: s4app.ClockApp 11:37:04.680 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ClockApp, resolveIt: true 11:37:04.681 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ClockApp] 11:37:04.681 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.App, resolveIt: true 11:37:04.683 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.App] 11:37:04.683 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ClockApp] 11:37:04.684 [main] INFO org.apache.s4.core.Server - Loading app: /tmp/s4apps/s4-showtime-0.0.0-SNAPSHOT.s4r 11:37:04.685 [main] INFO org.apache.s4.core.Server - App class name is: s4app.ShowTimeApp 11:37:04.685 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ShowTimeApp, resolveIt: true 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ShowTimeApp] 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.App, resolveIt: true 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.App] 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ShowTimeApp] 11:37:04.686 [main] INFO org.apache.s4.core.Server - Starting app s4app.ClockApp 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.System, resolveIt: true 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.System] 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.io.PrintStream, resolveIt: true 11:37:04.686 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.io.PrintStream] Initing CounterApp... 11:37:04.687 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ClockPE, resolveIt: true 11:37:04.687 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ClockPE] 11:37:04.687 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.ProcessingElement, resolveIt: true 11:37:04.689 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.ProcessingElement] 11:37:04.689 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ClockPE] 11:37:04.692 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.LoggerFactory, resolveIt: true 11:37:04.693 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.LoggerFactory] 11:37:04.695 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.Streamable, resolveIt: true 11:37:04.695 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.Streamable] 11:37:04.708 [main] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping generated overload dispatcher class for PE of class [class s4app.ClockPE] 11:37:04.714 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: OverloadDispatcher2841, resolveIt: true 11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [OverloadDispatcher2841] 11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.gen.OverloadDispatcher, resolveIt: true 11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.gen.OverloadDispatcher] 11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Object, resolveIt: true 11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Object] 11:37:04.716 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [OverloadDispatcher2841] 11:37:04.746 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.util.concurrent.TimeUnit, resolveIt: true 11:37:04.746 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.util.concurrent.TimeUnit] 11:37:04.747 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.EventSource, resolveIt: true 11:37:04.749 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.EventSource] Starting CounterApp... 11:37:04.751 [main] TRACE org.apache.s4.core.ProcessingElement - OnCreateInternal 11:37:04.757 [main] TRACE org.apache.s4.core.ProcessingElement - Num PE instances: 0. 11:37:04.757 [main] INFO org.apache.s4.core.Server - Starting app s4app.ShowTimeApp 11:37:04.757 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.System, resolveIt: true 11:37:04.757 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.System] 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.io.PrintStream, resolveIt: true 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.io.PrintStream] Initing ShowTimeApp... 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: s4app.ShowPE, resolveIt: true 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [s4app.ShowPE] 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.ProcessingElement, resolveIt: true 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.ProcessingElement] 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [s4app.ShowPE] 11:37:04.758 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.LoggerFactory, resolveIt: true 11:37:04.759 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.LoggerFactory] 11:37:04.759 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.base.Event, resolveIt: true 11:37:04.759 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.base.Event] 11:37:04.760 [main] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping generated overload dispatcher class for PE of class [class s4app.ShowPE] 11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: OverloadDispatcher783, resolveIt: true 11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Not a system class [OverloadDispatcher783] 11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.core.gen.OverloadDispatcher, resolveIt: true 11:37:04.761 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.core.gen.OverloadDispatcher] 11:37:04.762 [main] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Object, resolveIt: true 11:37:04.762 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Object] 11:37:04.762 [main] DEBUG o.a.s4.base.util.MultiClassLoader - Returning newly loaded class [OverloadDispatcher783] Starting ShowTimeApp... 11:37:04.763 [main] TRACE org.apache.s4.core.ProcessingElement - OnCreateInternal 11:37:04.763 [main] TRACE org.apache.s4.core.ProcessingElement - Num PE instances: 0. 11:37:04.763 [main] INFO org.apache.s4.core.Server - Resolving dependencies. 11:37:04.763 [main] INFO org.apache.s4.core.Server - Resolving dependencies for s4app.ClockApp 11:37:04.764 [main] INFO org.apache.s4.core.Server - App [s4app.ClockApp] exports event source [I can give you the time!] . 11:37:04.764 [main] INFO org.apache.s4.core.Server - Resolving dependencies for s4app.ShowTimeApp 11:37:04.764 [main] INFO org.apache.s4.core.Server - The consumer app is [s4app.ShowTimeApp] . 11:37:04.764 [main] INFO org.apache.s4.core.Server - Subscribing stream [I need the time.] from app [s4app.ShowTimeApp] to event source. 11:37:04.764 [main] INFO org.apache.s4.core.EventSource - Subscribing stream: I need the time. to event source: I can give you the time!. 11:37:04.764 [main] INFO org.apache.s4.core.Server - Completed applications startup. 11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.apache.s4.base.Event, resolveIt: true 11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.apache.s4.base.Event] 11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Long, resolveIt: true 11:37:05.749 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Long] 11:37:05.752 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.Logger, resolveIt: true 11:37:05.752 [Timer-0] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.Logger] 11:37:05.752 [Timer-0] INFO s4app.ClockPE - Sending event with tick 0 and time 1323545825749. 11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: java.lang.Long, resolveIt: true 11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [java.lang.Long] 11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - MultiClassLoader loadClass - className: org.slf4j.Logger, resolveIt: true 11:37:05.761 [I need the time.] DEBUG o.a.s4.base.util.MultiClassLoader - Returning system class (in CLASSPATH) [org.slf4j.Logger] 11:37:05.761 [I need the time.] INFO s4app.ShowPE - Received event with tick 0 and time 1323545825749. 11:37:06.749 [Timer-0] INFO s4app.ClockPE - Sending event with tick 1 and time 1323545826749. 11:37:06.749 [I need the time.] INFO s4app.ShowPE - Received event with tick 1 and time 1323545826749. 11:37:07.750 [Timer-0] INFO s4app.ClockPE - Sending event with tick 2 and time 1323545827750. 11:37:07.750 [I need the time.] INFO s4app.ShowPE - Received event with tick 2 and time 1323545827750. 11:37:08.751 [Timer-0] INFO s4app.ClockPE - Sending event with tick 3 and time 1323545828751. 11:37:08.752 [I need the time.] INFO s4app.ShowPE - Received event with tick 3 and time 1323545828751. 11:37:09.752 [Timer-0] INFO s4app.ClockPE - Sending event with tick 4 and time 1323545829752. 11:37:09.752 [I need the time.] INFO s4app.ShowPE - Received event with tick 4 and time 1323545829752. 11:37:10.753 [Timer-0] INFO s4app.ClockPE - Sending event with tick 5 and time 1323545830753. 11:37:10.754 [I need the time.] INFO s4app.ShowPE - Received event with tick 5 and time 1323545830753. 11:37:11.754 [Timer-0] INFO s4app.ClockPE - Sending event with tick 6 and time 1323545831754. 11:37:11.754 [I need the time.] INFO s4app.ShowPE - Received event with tick 6 and time 1323545831754. 11:37:12.755 [Timer-0] INFO s4app.ClockPE - Sending event with tick 7 and time 1323545832755. 11:37:12.755 [I need the time.] INFO s4app.ShowPE - Received event with tick 7 and time 1323545832755. 11:37:13.757 [Timer-0] INFO s4app.ClockPE - Sending event with tick 8 and time 1323545833757. 11:37:13.757 [I need the time.] INFO s4app.ShowPE - Received event with tick 8 and time 1323545833757. 11:37:14.756 [Timer-0] INFO s4app.ClockPE - Sending event with tick 9 and time 1323545834756. 11:37:14.756 [I need the time.] INFO s4app.ShowPE - Received event with tick 9 and time 1323545834756. 11:37:15.758 [Timer-0] INFO s4app.ClockPE - Sending event with tick 10 and time 1323545835757. 11:37:15.758 [I need the time.] INFO s4app.ShowPE - Received event with tick 10 and time 1323545835757.
        Hide
        Leo Neumeyer added a comment -

        It works! Two independent apps with no compile-time dependencies on each other get loaded dynamically and talk to each other without using any complicated framework, just by extending the abstract APP class. It is surprisingly simple. Once they get loaded into the running system they start producing and consuming events. Not as exciting as the first ARPANET message but...

        Next step is to put the app-dependency info (which app listens to what app via an event source) in the run-time tables that Matthieu is working on.

        We also need to run a test in which each app uses a different version of the same jar.

        Show
        Leo Neumeyer added a comment - It works! Two independent apps with no compile-time dependencies on each other get loaded dynamically and talk to each other without using any complicated framework, just by extending the abstract APP class. It is surprisingly simple. Once they get loaded into the running system they start producing and consuming events. Not as exciting as the first ARPANET message but... Next step is to put the app-dependency info (which app listens to what app via an event source) in the run-time tables that Matthieu is working on. We also need to run a test in which each app uses a different version of the same jar.
        Hide
        Leo Neumeyer added a comment -

        The dispatcher error looks like this:

        18:53:47.409 [I need the time.] ERROR OverloadDispatcher134 - Cannot dispatch event of type [org.apache.s4.base.Event] to PE of type [s4app.ShowPE] : no matching onEvent method found

        I changed the Event class and is no longer abstract, I wonder if it is related to that.

        Show
        Leo Neumeyer added a comment - The dispatcher error looks like this: 18:53:47.409 [I need the time.] ERROR OverloadDispatcher134 - Cannot dispatch event of type [org.apache.s4.base.Event] to PE of type [s4app.ShowPE] : no matching onEvent method found I changed the Event class and is no longer abstract, I wonder if it is related to that.
        Hide
        Leo Neumeyer added a comment -

        Update:

        Show
        Leo Neumeyer added a comment - Update: The producer: https://github.com/leoneu/s4-counter The consumer: https://github.com/leoneu/s4-showtime I modified Server and other classes to implement app dependency resolution. It is mostly hardcoded for this case for now. The branch is app-dep https://github.com/leoneu/s4-piper/tree/app-dep Producer sends ticks, consumer listens and prints. In Server I look for the producer app with an EventSource and the consumer with a stream by name. I am getting an OverloadDispatcher error (still debugging) but the code shows what we need to do to resolve app dependencies dynamically.
        Hide
        Leo Neumeyer added a comment -

        Changed ClockApp to use the new generic Event class with dynamic attributes to avoid inter-app dependencies.

        Show
        Leo Neumeyer added a comment - Changed ClockApp to use the new generic Event class with dynamic attributes to avoid inter-app dependencies.
        Hide
        Leo Neumeyer added a comment -

        I created an example of an App that exports events using EventSource. Consumer apps need to subscribe to the event source and to start receiving the events.

        https://github.com/leoneu/s4-counter

        • ClockApp: The producer app.
        • ClockPE: Generates events of type ClockEvent at periodic intervals. The payload is the time.
        • ClockEvent: The event type to be exported. (Consumer apps can get the time.)

        Now other apps can start consuming without creating dependencies between apps at compile time. The shared event needs to be known by all the apps.

        The Server class will need to get the Streamables using the App getter method: public List<Streamable<? extends Event>> getStreams() and figure out which streamable needs to bind based on the deployment descriptor.

        NOTE: In this example I am using ClockEvent which is defined in ClockApp but it doesn't have to be that way. To further reduce dependencies we can use a GenericEvent which we haven't implemented yet. The Generic Event is in the S4 libraries so it can be easily shared by all apps.
        Does this make sense? Comments? Once we agree on this, we can move to the Adaptor discussion.

        -leo

        Show
        Leo Neumeyer added a comment - I created an example of an App that exports events using EventSource. Consumer apps need to subscribe to the event source and to start receiving the events. https://github.com/leoneu/s4-counter ClockApp: The producer app. ClockPE: Generates events of type ClockEvent at periodic intervals. The payload is the time. ClockEvent: The event type to be exported. (Consumer apps can get the time.) Now other apps can start consuming without creating dependencies between apps at compile time. The shared event needs to be known by all the apps. The Server class will need to get the Streamables using the App getter method: public List<Streamable<? extends Event>> getStreams() and figure out which streamable needs to bind based on the deployment descriptor. NOTE: In this example I am using ClockEvent which is defined in ClockApp but it doesn't have to be that way. To further reduce dependencies we can use a GenericEvent which we haven't implemented yet. The Generic Event is in the S4 libraries so it can be easily shared by all apps. Does this make sense? Comments? Once we agree on this, we can move to the Adaptor discussion. -leo
        Hide
        Leo Neumeyer added a comment -

        I think that the issue is in the details, that is, what design leads to the simplest API. I argue that for App2 to receive events from App1, we want App2 to have access to a local reference of App1. (Do we agree on this? otherwise I may be missing something.) By deploying all apps in all logical nodes, we accomplish that. I think this is an easy first step, we can later relax the requirements. The only case in which this could be a problem is in a very large cluster where it may be desirable not to update all the nodes. I doubt that having gigantic clusters will be practical. Having smaller ones may be a better option. In any case, we could relax the requirement as follows:

        • Because App2 does not export its stream, it only needs to be in the cluster in which it runs.
        • Because App1 is only consumed by App2, App1 only needs to be deployed in the cluster in which it runs and in the cluster in which App2 runs.

        So symmetry is not an absolute requirement, we just need to determine the dependencies and launch the app to the right nodes which is pretty neat. However, I wouldn't worry about this right now. We can add more deployment options later.

        I think that the next step is to prototype this by writing App1 and App2 and defining the APIs and process for resolving dependencies. I'll write something soon.

        -leo

        Show
        Leo Neumeyer added a comment - I think that the issue is in the details, that is, what design leads to the simplest API. I argue that for App2 to receive events from App1, we want App2 to have access to a local reference of App1. (Do we agree on this? otherwise I may be missing something.) By deploying all apps in all logical nodes, we accomplish that. I think this is an easy first step, we can later relax the requirements. The only case in which this could be a problem is in a very large cluster where it may be desirable not to update all the nodes. I doubt that having gigantic clusters will be practical. Having smaller ones may be a better option. In any case, we could relax the requirement as follows: Because App2 does not export its stream, it only needs to be in the cluster in which it runs. Because App1 is only consumed by App2, App1 only needs to be deployed in the cluster in which it runs and in the cluster in which App2 runs. So symmetry is not an absolute requirement, we just need to determine the dependencies and launch the app to the right nodes which is pretty neat. However, I wouldn't worry about this right now. We can add more deployment options later. I think that the next step is to prototype this by writing App1 and App2 and defining the APIs and process for resolving dependencies. I'll write something soon. -leo
        Hide
        Matthieu Morel added a comment -

        I don't understand the intended limitation on the symmetry between cluster.
        My alternative view is that we have symmetry of apps within a cluster but not between clusters.

        • Within the same logical cluster, orange and red apps communicate through the gray stream

        PEs from the red app send data on the grey stream according to a partitioning function that takes into account the number of nodes in the target cluster (in this case, the same cluster).

        • Between clusters, green app communicates with orange app through the red stream. There is no symmetry between those apps, because they belong to logical clusters of different sizes, but we use the same mechanism for partitioning data (that info needs to be derivable from the stream configuration).
        • How does the adaptor fit there?

        1. If the adaptor is an S4 app, simply configure the adaptor on a specific logical cluster. It can be constituted of 1 node, that would be the case in our twitter example. But it may also be constitued of several nodes, when possible, in order to distribute load and processing. This is the same approach than in S4 pre-0.5 (app cluster and adapter cluster).

        2. send S4 events on a stream. Events will be distributed to the apps registered to that stream on the target cluster.

        • Bottom line: I don't think we need symmetric between clusters, but we must be able to retrieve or rebuild a partitioning scheme from the strem configuration.
        Show
        Matthieu Morel added a comment - I don't understand the intended limitation on the symmetry between cluster. My alternative view is that we have symmetry of apps within a cluster but not between clusters. Example: To fuel the discussion, here is an example of a typical setup involving streams, logical cluster, apps and physical nodes: https://docs.google.com/drawings/d/1koAjnGFYsXfUqrXrFc9z2xQCgia0F9qog7DZng0naLU/edit?hl=en_US Within the same logical cluster, orange and red apps communicate through the gray stream PEs from the red app send data on the grey stream according to a partitioning function that takes into account the number of nodes in the target cluster (in this case, the same cluster). Between clusters, green app communicates with orange app through the red stream. There is no symmetry between those apps, because they belong to logical clusters of different sizes, but we use the same mechanism for partitioning data (that info needs to be derivable from the stream configuration). How does the adaptor fit there? 1. If the adaptor is an S4 app, simply configure the adaptor on a specific logical cluster. It can be constituted of 1 node, that would be the case in our twitter example. But it may also be constitued of several nodes, when possible, in order to distribute load and processing. This is the same approach than in S4 pre-0.5 (app cluster and adapter cluster). 2. send S4 events on a stream. Events will be distributed to the apps registered to that stream on the target cluster. Bottom line: I don't think we need symmetric between clusters, but we must be able to retrieve or rebuild a partitioning scheme from the strem configuration.
        Hide
        Leo Neumeyer added a comment -

        Subcluster definition: group of nodes over which we distribute an S4 application AND we establish dependencies between applications across clusters. (If there was no dependency, the subclusters would be just plain separate clusters.)

        The key here is how to establish inter-app communication across clusters. An EventSource (ES) API is made available in a Twitter preprocessing cluster. When configured, the app that hosts the ES needs to know how to talk to, for example, SmatApp app hosted on a different subcluster. If nodes were symmetric, this can be done very easily without introducing any changes to the current API and to the apps. In fact, the app developer shouldn't have to know what the final configuration will be (single node, one clusters, 2 subclusters, etc.)

        If nodes were not symmetric across clusters, we would have to have a different approach, that's why Bruce and I concluded that a fully symmetric cluster was a better approach. There is very little downside and it's simpler, all nodes are still identical. In the future we could relax this requirements but it doesn't seem worth it to complicate things now.

        The main challenge now is to support multiple senders, one for each subcluster. Events would be handed to each sender and sender will decide if the event must be transmitted and to what node.

        Thoughts?

        Show
        Leo Neumeyer added a comment - Subcluster definition: group of nodes over which we distribute an S4 application AND we establish dependencies between applications across clusters. (If there was no dependency, the subclusters would be just plain separate clusters.) The key here is how to establish inter-app communication across clusters. An EventSource (ES) API is made available in a Twitter preprocessing cluster. When configured, the app that hosts the ES needs to know how to talk to, for example, SmatApp app hosted on a different subcluster. If nodes were symmetric, this can be done very easily without introducing any changes to the current API and to the apps. In fact, the app developer shouldn't have to know what the final configuration will be (single node, one clusters, 2 subclusters, etc.) If nodes were not symmetric across clusters, we would have to have a different approach, that's why Bruce and I concluded that a fully symmetric cluster was a better approach. There is very little downside and it's simpler, all nodes are still identical. In the future we could relax this requirements but it doesn't seem worth it to complicate things now. The main challenge now is to support multiple senders, one for each subcluster. Events would be handed to each sender and sender will decide if the event must be transmitted and to what node. Thoughts?
        Hide
        Leo Neumeyer added a comment -

        Diagram to explain how the multi-cluster configuration works.

        Show
        Leo Neumeyer added a comment - Diagram to explain how the multi-cluster configuration works.
        Hide
        Matthieu Morel added a comment -

        I also like the idea of deploying adaptors as S4 applications and therefore benefiting from the programming model and distributed nature of S4 apps.

        But I wonder how to partition and inject the data to be injected, in that design. For instance, the twitter stream example in S4 0.3 creates S4 events out of a single http connection. What component would do that with the new design?

        About inter-apps dependencies, I would suggest to create a separate ticket addressing it.

        In addition, I am in favor of defining a generic interface to the component that manages S4 apps. This interface could be exposed as a Rest API for instance. Indeed, we probably want to be able to automate deployments and some administration tasks, which is not possible through a web interface. The web interface will need a standardize access to the S4 cluster management system anyway, and implementing a UI with Play using Rest is quite straightforward.

        Show
        Matthieu Morel added a comment - I also like the idea of deploying adaptors as S4 applications and therefore benefiting from the programming model and distributed nature of S4 apps. But I wonder how to partition and inject the data to be injected, in that design. For instance, the twitter stream example in S4 0.3 creates S4 events out of a single http connection. What component would do that with the new design? About inter-apps dependencies, I would suggest to create a separate ticket addressing it. In addition, I am in favor of defining a generic interface to the component that manages S4 apps. This interface could be exposed as a Rest API for instance. Indeed, we probably want to be able to automate deployments and some administration tasks, which is not possible through a web interface. The web interface will need a standardize access to the S4 cluster management system anyway, and implementing a UI with Play using Rest is quite straightforward.
        Hide
        Leo Neumeyer added a comment -

        Bruce suggested a further generalization of this idea:

        Instead of one cluster, S4 will support many clusters that can be configured based on requirements. For example, a typical use case is to have to pre-process raw inbound events at high speed. This will require adapting the event format, filtering based on logic, etc. We can dedicate an S4 cluster to this function. The generic S4 cluster would be a separate one. In this case we have an adaptor (processes a single stream) and a general cluster (processes all streams allowing combinations across streams.) If we make all node clusters symmetric (they all have identical code running), then it is easy to do intercluster communication. We just need to support multiple senders, the API would be unchanged. The only complexity is in configuring the clusters and senders and configuring inter-cluster streams. Because an adaptor is also an App, there is very little we need to do to support adaptors.

        All inter-app dependencies need to be established during deployment using run-time tables via ZK. I think that it will be easier to start building a simple admin web app to do this. The alternative would be a command line interface but not sure it's necessary. The web app would be used to configure the cluster(s), assign names, load/unload apps, establish dependencies, monitor Apps/PEs, etc. SInce Play Framework seems to be the latest hot thing, maybe we can give it a try, any volunteers?

        Show
        Leo Neumeyer added a comment - Bruce suggested a further generalization of this idea: Instead of one cluster, S4 will support many clusters that can be configured based on requirements. For example, a typical use case is to have to pre-process raw inbound events at high speed. This will require adapting the event format, filtering based on logic, etc. We can dedicate an S4 cluster to this function. The generic S4 cluster would be a separate one. In this case we have an adaptor (processes a single stream) and a general cluster (processes all streams allowing combinations across streams.) If we make all node clusters symmetric (they all have identical code running), then it is easy to do intercluster communication. We just need to support multiple senders, the API would be unchanged. The only complexity is in configuring the clusters and senders and configuring inter-cluster streams. Because an adaptor is also an App, there is very little we need to do to support adaptors. All inter-app dependencies need to be established during deployment using run-time tables via ZK. I think that it will be easier to start building a simple admin web app to do this. The alternative would be a command line interface but not sure it's necessary. The web app would be used to configure the cluster(s), assign names, load/unload apps, establish dependencies, monitor Apps/PEs, etc. SInce Play Framework seems to be the latest hot thing, maybe we can give it a try, any volunteers?
        Hide
        J Mohamed Zahoor added a comment -

        When we do this, i think we should also provide a pre-serialized POJO version along with the standard JSON version of adapter..

        Show
        J Mohamed Zahoor added a comment - When we do this, i think we should also provide a pre-serialized POJO version along with the standard JSON version of adapter..

          People

          • Assignee:
            Matthieu Morel
            Reporter:
            Leo Neumeyer
          • Votes:
            1 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development