Details

    • Type: Epic
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: v0.9.4
    • Fix Version/s: None
    • Component/s: Master
    • Labels: None

      Description

      Multi-master configurations for high availability and node scalability suffer from several deficiencies. This jira will serve as a coordination point for another implementation of the master's configuration storage mechanism and/or a possible re-architecting of its code.

      These are some of the core issues:

      1) Translations for logical nodes (auto*Chain, logicalSource/Sink) do not work in multi-master.
      2) Flows abstraction is not scalable.
      3) ZK-backend needs to be revisited to be scalable (single znodes vs several znodes).
      4) OOME error occurs due to ack sharing mechanism.
      5) Each master maintains its own command history.

      Fundamentally, this is due to two core issues:

      1) Not all state information is shared.
      2) The current implementation does not have a leader.

        Issue Links

          Issues in Epic

          There are no issues in this epic.

            Activity

            Jonathan Hsieh added a comment -

            I don't have a strong opinion but I want to point out that this would mean all nodes would need direct communication with ZK. Given that nodes are potentially deployed on unprivileged segments of the network, we'd be asking folks to expose ZK to some degree. This isn't all that different than exposing the master but it could make people shy away from using the same ZK quorum for Flume and, say, HBase.

            I've just started familiarizing myself with the zk api – it looks like zk has some mechanisms for auth, and it seems reasonably documented here: http://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#sc_ZooKeeperAccessControl
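
            For concreteness, a minimal sketch of what digest-based auth looks like with the plain ZooKeeper Java client (the connection string, credentials, and znode path are made up for illustration, and error handling is omitted):

            import org.apache.zookeeper.CreateMode;
            import org.apache.zookeeper.ZooDefs;
            import org.apache.zookeeper.ZooKeeper;

            public class ZkAuthSketch {
                public static void main(String[] args) throws Exception {
                    // Connect to the quorum (no-op watcher for brevity).
                    ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30000, event -> { });

                    // Authenticate this session with the "digest" scheme.
                    zk.addAuthInfo("digest", "flume:secret".getBytes());

                    // Create a znode that only the authenticated creator can read/write/admin.
                    zk.create("/flume-acl-demo", "hello".getBytes(),
                            ZooDefs.Ids.CREATOR_ALL_ACL, CreateMode.PERSISTENT);

                    zk.close();
                }
            }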

            Jonathan Hsieh added a comment - edited

            I've spent 12 hours in airplanes in the past three days, read up on ZK semantics (http://www.usenix.org/event/atc10/tech/full_papers/Hunt.pdf), and want to propose another option. IMO this seems like one step after the progression of designs in the wiki page. (Also, the wiki needs to be a little clearer about what a client is – is the flume shell the client, or is a flume node a client?)

            Here's the main idea:

            The main interface we would define for the flume master's information is simply a zookeeper znode directory structure. Flume nodes would have read and write access to this znode structure via direct communication with the zk-quorum. The flume master would also have r/w access to this and would essentially only serve as a read/write web-wrapper for the zk-stored info and a server for flume shell clients. The e2e acks are punted on in this sweep – for now we'd just use the existing ack management in the master code.

            I see a few potential design wins from this style:

            • This seems more consistent with other systems and their zk use patterns.
            • Nodes could "configure" themselves by attempting to directly write to zk.
            • This will greatly simplify the master. Heartbeat management would just use ephemeral nodes in zk. A feature like config translations could interact directly with zk and be completely decoupled from the master daemon.
            • This style seems to solve extensibility and future proofing issues. Future features may require extensions to this flume configuration store, and would just write more data to the zk structure. Examples of shared state include flow/tier/group info, links to different webpages on the different daemons, etc. These could be added incrementally without incurring a compatibility breaking change.

            The master currently has a few main tables:

            The logical node status table (heartbeats/liveness/state)
            The logical node configuration table (source/sink)
            The physical node -> logical node mapping table
            The Ack table.

            As a strawman, I'd think we would use a zk znode structure centered around two main "subdirectories" – the logical node subdir and the physical node subdir. Information would be stored in znodes within the subdirs. For example, we could have a structure like this:

            flume/logicalnodes/<ln1>/
            flume/logicalnodes/<ln1>/source
            flume/logicalnodes/<ln1>/sink
            flume/logicalnodes/<ln1>/state* // this one is ephemeral
            flume/logicalnodes/<ln1>/group
            flume/logicalnodes/<ln1>/http
            flume/logicalnodes/<ln1>/json
            flume/physicalnodes/<pn1>/
            flume/physicalnodes/<pn1>/host
            flume/physicalnodes/<pn1>/lns // this is the location for a physical->logical mapping
            flume/physicalnodes/<pn1>/lns/<ln1>
            flume/physicalnodes/<pn1>/lns/<ln2>
            flume/physicalnodes/<pn1>/http
            flume/physicalnodes/<pn1>/json
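
            To make the "nodes configure themselves" idea concrete, a rough sketch of a logical node writing its own entries into this strawman layout via the ZooKeeper Java API (the paths, source/sink specs, and open ACLs are illustrative; parent znodes are assumed to already exist and error handling is omitted):

            import org.apache.zookeeper.CreateMode;
            import org.apache.zookeeper.ZooDefs;
            import org.apache.zookeeper.ZooKeeper;

            public class NodeSelfConfigSketch {
                // Create a persistent znode with open ACLs (sketch only; real ACLs would be tighter).
                static void put(ZooKeeper zk, String path, String data) throws Exception {
                    zk.create(path, data.getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                }

                public static void main(String[] args) throws Exception {
                    ZooKeeper zk = new ZooKeeper("zk1:2181", 30000, event -> { });

                    // Logical node ln1 writes its own configuration directly into the store.
                    put(zk, "/flume/logicalnodes/ln1", "");
                    put(zk, "/flume/logicalnodes/ln1/source", "tail(\"/var/log/app.log\")");
                    put(zk, "/flume/logicalnodes/ln1/sink", "agentE2ESink(\"collector\")");

                    // Liveness: an ephemeral znode that vanishes when the node's session dies,
                    // which is all the heartbeat machinery a master would need to watch.
                    zk.create("/flume/logicalnodes/ln1/state", "ACTIVE".getBytes(),
                            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

                    // Physical -> logical mapping under the physical node's subdir.
                    put(zk, "/flume/physicalnodes/pn1/lns/ln1", "");
                }
            }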

            E. Sammer added a comment -

            I've created a placeholder wiki page in gh (for lack of a better public place to do so). https://github.com/cloudera/flume/wiki/Master-Redesign

            Let me know if I've expressed the options correctly. I didn't get into how these options work or their trade-offs. I just wanted to make sure we agree on the basics of the options first.

            E. Sammer added a comment -

            So jumping back to the multi-master issue (+1 to separating the discussions), I think the simplest approach is to just have a single active master coordinated via a znode + watch (a la HBase) and allow takeover during failures. Clients' static configs can keep the list of all masters and make an initial RPC to ask a random master who the leader is, or we can drop down a level and just have clients get a ZK quorum list directly.

            I don't have a strong opinion but I want to point out that this would mean all nodes would need direct communication with ZK. Given that nodes are potentially deployed on unprivileged segments of the network, we'd be asking folks to expose ZK to some degree. This isn't all that different than exposing the master but it could make people shy away from using the same ZK quorum for Flume and, say, HBase.
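
            For reference, a minimal sketch of that znode + watch election (the path, timeout, and naming are illustrative; session loss and reconnection are not handled):

            import org.apache.zookeeper.CreateMode;
            import org.apache.zookeeper.KeeperException;
            import org.apache.zookeeper.Watcher.Event.EventType;
            import org.apache.zookeeper.ZooDefs;
            import org.apache.zookeeper.ZooKeeper;

            public class MasterElectionSketch {
                public static void main(String[] args) throws Exception {
                    final Object barrier = new Object();
                    ZooKeeper zk = new ZooKeeper("zk1:2181", 30000, event -> {
                        // Wake up whenever the active-master znode disappears.
                        if (event.getType() == EventType.NodeDeleted) {
                            synchronized (barrier) { barrier.notifyAll(); }
                        }
                    });

                    byte[] me = java.net.InetAddress.getLocalHost().getHostName().getBytes();
                    while (true) {
                        try {
                            // Whoever creates the ephemeral znode is the active master.
                            zk.create("/flume/master", me, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                                    CreateMode.EPHEMERAL);
                            System.out.println("this master is now active");
                            return; // carry on as the leader
                        } catch (KeeperException.NodeExistsException e) {
                            // Someone else is active: watch the znode and wait for it to go away.
                            if (zk.exists("/flume/master", true) != null) {
                                synchronized (barrier) { barrier.wait(10000); } // re-check periodically
                            }
                        }
                    }
                }
            }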

            Disabled imported user added a comment -

            Good plan!

            Just to clarify - I don't think of 'flows' as anything more than a logical grouping of nodes, so that cross-cutting configuration changes (like compression, reliability, other features that require coordination amongst different tiers) can be sensibly applied by tools that update the configuration. The master can also choose failover chains for an agent from intermediate nodes belonging to the same flow. They are a way to isolate one 'kind' of data from another, and to allow users to manage several use cases of Flume on the same physical installation independently.

            So a flow is just a label. Nothing more complicated than that.

            Jonathan Hsieh added a comment -

            My initial intent was to have one place to focus on what I consider to be the fundamental issue with multi-master currently – not all data is consistent across masters. We seem to have strayed onto other important and meaty concerns that either depend upon fixing the fundamental issue or potentially don't need it.

            Would you be ok if we move the discussion about handling e2e acks in a different manner to a separate epic?

            Likewise, at the moment we can't do flows 1.0 properly without getting this state sharing story correct. I think there is plenty of meat to move the flows 2.0 discussion to a different epic that assumes we get state sharing done.

            Sound good?

            Jon.

            E. Sammer added a comment -

            I would say that this is a solved problem in a lot of ways. Messaging systems in general have patterns to deal with routing. I think the simplest and easiest to grok is SMTP. Messages come from a source and have a final destination (i.e. the email address). They look up an MX for their destination domain (i.e. a group of nodes capable of receiving a message in an equivalent manner - what Jon calls a tier above, but without the implication of hierarchy - I like "group" as a name better).

            In Flume speak, I'd propose:

            • we continue to have a source and a (final) sink.
            • we create the notion of a group; a set of nodes that are all equivalent. They should work just like MX records in DNS: giving two nodes in a group the same cost produces round robin, otherwise it's a preference, lowest first.
            • Sinks then target groups OR individual nodes (in the case where there's no failover).
            • As a message passes through a series of hops, we inject 'Received:' style records.
            • ACKs are delivered in reverse Received order.
            • If a topology changes while trying to deliver an ACK, we drop all ACKs for that path (i.e. a flow) and allow natural retransmission to occur.
            • Routing loops etc are purely a configuration time check.
            • We allow for durability between any two hops. Ex: topology 1: A -> B -> HDFS, A retains the WAL, B does ACKs to A. Topology 2: A -> B -> C -> D -> HDFS, A -> B is E2E, but maybe C -> D is BE. If it's A -> D, the ACK just has farther to go (and goes in reverse Received order).

            Simple. Known to work. Fast (except ACKs might travel through more nodes, but who cares, they're small).
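
            To illustrate the MX-style selection above, a toy sketch (class and field names are invented here): the lowest-cost members of a group are preferred, and equal costs round-robin.

            import java.util.ArrayList;
            import java.util.List;
            import java.util.concurrent.atomic.AtomicLong;

            // A "group" behaves like a set of MX records: lowest cost wins, ties round-robin.
            public class GroupSketch {
                static final class Member {
                    final String node; final int cost;
                    Member(String node, int cost) { this.node = node; this.cost = cost; }
                }

                private final List<Member> members = new ArrayList<>();
                private final AtomicLong counter = new AtomicLong();

                void add(String node, int cost) { members.add(new Member(node, cost)); }

                String nextHop() {
                    int best = Integer.MAX_VALUE;
                    for (Member m : members) best = Math.min(best, m.cost);
                    List<String> preferred = new ArrayList<>();
                    for (Member m : members) if (m.cost == best) preferred.add(m.node);
                    // Round-robin among the equal-cost (preferred) members.
                    return preferred.get((int) (counter.getAndIncrement() % preferred.size()));
                }

                public static void main(String[] args) {
                    GroupSketch collectors = new GroupSketch();
                    collectors.add("collector-a", 10);
                    collectors.add("collector-b", 10);
                    collectors.add("collector-c", 20); // higher cost: only a fallback preference
                    for (int i = 0; i < 4; i++) System.out.println(collectors.nextHop());
                    // prints collector-a, collector-b, collector-a, collector-b
                }
            }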

            Jonathan Hsieh added a comment -

            >> Scenario 1) Suppose I want to have a flow where we have source data at an agent, that then goes to some intermediate node for processing and then goes to a collector. With the current model, I would tag all the nodes to be in the same flow. How does an agent know it needs to send to the intermediate processing node instead of the collector?

            >Isn't that what the sink is for? The sink encodes information about the next hop. The flow just restricts the set of nodes (for load balancing) that the next hop can be drawn from.

            I'm a little confused by the first question. When you speak of sinks, which sink are you talking about – the rpc sink on the agent, the processor or the hdfs/hbase/flumebase on a collector?

            The current properties and constraints on flows 1.0 are that 1) "flow-is-a-set-of-nodes" and 2) "nodes-are-only-in-one-flow". The problem really shows up when there is a processor in the middle. It is both a source (for the collector) and a sink (for the agent). If flows 1.0 are really focused on edge sets between two sets of nodes, constraint #2 gets in the way because the intermediate node would need to be part of two flows. If flows 1.0 are focused on end-to-end situations, then constraint #1 complicates the logic needed to express automatically generated topologies.

            One interesting suggestion is to have something like a logicalSources and logicalSink(logicalNodeId) that uses a tier id instead of a logicalNodeId. This would give a name to the set of edges between tiers of nodes.

            > This is a problem if the query 'tier' belongs to more than one flow, otherwise it fits fine with the abstraction. Tiers are a nice abstraction as well; when combined with flows they give horizontal and vertical abstractions. The relationship between tiers and flows needs to be explicit: it sounds like each flow is comprised of tiers and no tier can belong to more than one flow. That sounds like a good plan to me, as it seems like you still need flows to handle end-to-end constructs like reliability and compression.

            Let's make sure we are talking about the same thing when we say flow. I'll define a flow to be a single, potentially multi-hop path from root (sources) to a final sink. With this definition, there would be one flow in scenario 1 and two flows in the scenario 2 example: an [agent -> collector] flow and an [agent -> query] flow.

            Can you give an example of the problem you are talking about? Is the concern you bring up something like the DAG situation – [sources1 -> query], [sources2 -> query] (DAG vs Tree?)? Or is it that someone now has to tell the agent to send to both the query and collector destinations?

            My first thought is that a source tier sending to multiple destination tiers is reasonable. We may need to enforce a restriction where only one end-to-end flow is reliable, but that is doable with knowledge of the tier-to-tier flows.

            I think that the DAG situation, two different source tiers going to the same destination tier, is reasonable as well. I've seen production scenarios where folks are collecting two different sources of data and sending them to the same collector that uses the output bucketing feature to demux data when it writes to hdfs.

            I generally see cycles as a problem and out of scope for this.

            I think there are different contexts in which to apply properties to a flow or part of a flow. Ideally this is something that could be encoded and stored at the master. Some properties, like reliability, are likely end-to-end properties of a flow. Compression, batching, and encryption are likely properties of an edge between tiers. For example, let's say we have a scenario where we send data to a "local" relay using cheap batching and lzo compression, but then want to use "expensive" gzip compression and encryption for a wan connection to a remote collector. We want end-to-end reliability overall, but different compression algorithms on different tier connections.
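
            Purely as an illustration of that split (all names invented): end-to-end properties hang off the flow as a whole, while compression/batching/encryption hang off individual tier-to-tier edges.

            import java.util.LinkedHashMap;
            import java.util.Map;

            // Illustrative only: one flow-wide property set plus per-edge overrides.
            public class FlowPropertySketch {
                static final class Edge {
                    final String fromTier, toTier;
                    final Map<String, String> props = new LinkedHashMap<>();
                    Edge(String from, String to) { this.fromTier = from; this.toTier = to; }
                }

                public static void main(String[] args) {
                    Map<String, String> flowWide = new LinkedHashMap<>();
                    flowWide.put("reliability", "E2E");                 // end-to-end property

                    Edge agentToRelay = new Edge("agents", "local-relay");
                    agentToRelay.props.put("compression", "lzo");       // cheap local hop
                    agentToRelay.props.put("batching", "100");

                    Edge relayToCollector = new Edge("local-relay", "remote-collectors");
                    relayToCollector.props.put("compression", "gzip");  // expensive WAN hop
                    relayToCollector.props.put("encryption", "on");

                    System.out.println("flow-wide: " + flowWide);
                    System.out.println(agentToRelay.fromTier + " -> " + agentToRelay.toTier
                            + ": " + agentToRelay.props);
                    System.out.println(relayToCollector.fromTier + " -> " + relayToCollector.toTier
                            + ": " + relayToCollector.props);
                }
            }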

            Disabled imported user added a comment -

            > Scenario 1) Suppose I want to have a flow where we have source data at an agent, that then goes to some intermediate node for processing and then goes to a collector. With the current model, I would tag all the nodes to be in the same flow. How does an agent know it needs to send to the intermediate processing node instead of the collector?

            Isn't that what the sink is for? The sink encodes information about the next hop. The flow just restricts the set of nodes (for load balancing) that the next hop can be drawn from.

            > Scenario 2) Suppose I want to have a flow where we have source data at an agent, and that we want to send that data to a collector and to a different fork where a different kind of processing happens (flume base query, or elastic search, or hbase, etc).

            This is a problem if the query 'tier' belongs to more than one flow, otherwise it fits fine with the abstraction. Tiers are a nice abstraction as well; when combined with flows they give horizontal and vertical abstractions. The relationship between tiers and flows needs to be explicit: it sounds like each flow is comprised of tiers and no tier can belong to more than one flow. That sounds like a good plan to me, as it seems like you still need flows to handle end-to-end constructs like reliability and compression.

            Jonathan Hsieh added a comment - edited

            The concept of flows is a good one; however, the abstraction (a flow is a set of nodes) that underlies the current implementation is insufficient to deal with some fairly basic scenarios.

            Let's start with some basic scenarios and strawman solutions:

            Currently, flows are limited to a linear two tier scenario.

            Scenario 1) Suppose I want to have a flow where we have source data at an agent, that then goes to some intermediate node for processing and then goes to a collector. With the current model, I would tag all the nodes to be in the same flow. How does an agent know it needs to send to the intermediate processing node instead of the collector?

            Scenario 2) Suppose I want to have a flow where we have source data at an agent, and that we want to send that data to a collector and to a different fork where a different kind of processing happens (flume base query, or elastic search, or hbase, etc).

            Possible solutions:

            1) We could add more custom smarts to the master (once it has all state information) which can "handle" the scenario 1 case. At the master, we could detect whether there are processor nodes in this flow group and then have special-case code that would route to the processor and then from the processor to the collector. This is essentially another special topology. Scenario 2 does not fit this custom model, so operators need to go back to the low-level configuration and routing. We could probably add another special class of nodes (agents, collectors, processors, and auxsink?) and more special-case code.

            2) We could add the abstraction of a tier. A tier would be a set of nodes (similar to the current flowid). The new flow abstraction would be an ordered series (or tree) of tiers. The tier abstraction allows us to group similar nodes (webservers, collectors, incremental search indexers, hbase writers, processors, etc). The new flows abstraction allows us to set up routing from one tier to another. For scenario 1, all agents would be in the agent tier, all processors in the processor tier, and all collectors in the collector tier. The new flow would be the series [agent tier -> processor tier -> collector tier]. From this spec, deciding routing from agent to processor and processor to collector is fairly straightforward. If we added another processing tier, it would remain straightforward. For scenario 2, this scheme would also still be sufficient. All the agents would be in the agent tier, the collectors in the collector tier, and the forked destination would be in a separate tier. The flow would now include the tree that has [agent -> collector] and [agent -> forked]. This abstraction also seems sufficient to combine scenarios 1 and 2 (the flow would be the tree formed by the union of [agent -> processor -> collector] and [agent -> forked]). A small data-model sketch follows the questions below.

            Some questions:
            1) Can logical nodes belong to more than one tier? (initial guess is no)
            2) Can we allow cycles in tiers? ex: [ tier1 -> tier2 -> tier 1]. Most likely no in initial attempt.
            3) Can we check for cycles and enforce no cycles? (if logical nodes only in one tier, yes)
            4) Can this be encoded in the current Master? (No, we would need to add a new zk node to store data. This is the fourth time [previous were configs, logical/physical mappings, chokes], so it is worth reconsidering the design.)
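
            The data-model sketch referenced in option 2 above (all names invented for illustration): a tier is a named set of logical nodes, a flow is a tree of tiers, and tier-to-tier routing falls out of the tree edges.

            import java.util.ArrayList;
            import java.util.LinkedHashSet;
            import java.util.List;
            import java.util.Set;

            public class TierFlowSketch {
                static final class Tier {
                    final String name;
                    final Set<String> logicalNodes = new LinkedHashSet<>();
                    final List<Tier> downstream = new ArrayList<>(); // edges to the next tiers
                    Tier(String name) { this.name = name; }
                }

                public static void main(String[] args) {
                    Tier agents = new Tier("agents");
                    Tier processors = new Tier("processors");
                    Tier collectors = new Tier("collectors");
                    Tier forked = new Tier("forked"); // e.g. an hbase/search/query destination

                    // Scenarios 1 and 2 combined: agents -> processors -> collectors, agents -> forked.
                    agents.downstream.add(processors);
                    processors.downstream.add(collectors);
                    agents.downstream.add(forked);

                    agents.logicalNodes.add("web01");
                    processors.logicalNodes.add("proc01");
                    collectors.logicalNodes.add("coll01");
                    forked.logicalNodes.add("hbase01");

                    // Routing for a node is simply "everything in my tier's downstream tiers".
                    for (Tier next : agents.downstream) {
                        System.out.println("agents route to tier '" + next.name + "': " + next.logicalNodes);
                    }
                }
            }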

            Jonathan Hsieh added a comment - edited

            Definitely plan on doing that and doing it here. This is a significantly large task that will likely change APIs.

            I've made the issue type "epic". I think that anything marked "epic" should have a discussion about design. I believe the hbase connector had a similar cycle, as well as the maven one. The refactor of the driver to fix flaky tests is another likely candidate where we could have done this.

            Disabled imported user added a comment -

            +1 to Eric's suggestion.

            Will this issue result in the concept of 'flows' being scrapped? I'd like to understand the rationale behind that, as several users have commented to me anecdotally that it's an easy abstraction to understand.

            E. Sammer added a comment -

            It would be super cool if we could produce a draft for this one prior to code. It's a relatively big feature and it would be great to have 1. the rationale behind the new design documented and 2. a chance to vet design issues prior to development.


              People

              • Assignee: Unassigned
              • Reporter: Jonathan Hsieh
              • Votes: 0
              • Watchers: 4
