Flume / FLUME-1479

Multiple Sinks can connect to single Channel

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: v1.2.0
    • Fix Version/s: v1.3.0
    • Component/s: Configuration
    • Labels:
      None

      Description

      If we have one Channel (mc) and two Sinks (hsa, hsb), both Sinks can be connected to that Channel with a configuration such as:

      agent.sinks.hsa.channel = mc
      agent.sinks.hsb.channel = mc

      This means that multiple Sinks can connect to a single Channel. Normally, only one Sink should connect to a given Channel.

          Activity

          Yongkun Wang added a comment -

          I can access the jira now. Maybe it's better to copy the discussion here:

          On 12/08/10 14:48, "Wang, Yongkun | Yongkun | BDD" <yongkun.wang@mail.rakuten.com> wrote:

          Hi Denny,

          I am working on the patch now; it's not difficult. I have listed the changes in that JIRA.
          I think you misunderstood my design. I don't maintain the order of the events. Instead, I make sure that each sink gets the same events (or different events, as specified by a selector).

          Suppose Channel (mc) contains the following events: 4,3,2,1

          If we simply enable this through configuration, it may work like this:
          Sink "hsa" may get 1,3;
          Sink "hsb" may get 2,4;
          So different sinks will get different data. Is this what the user wants?

          In my design, "hsa" and "hsb" will both get "4,3,2,1". This is the typical case when a user wants to fan out the data to two places (e.g. one for batch and another for real-time analysis).

          Regards,
          Yongkun Wang

          On 12/08/10 14:29, "Denny Ye" <dennyy99@gmail.com> wrote:

          hi Yongkun,

          JIRA can be accessed now.

          I think the ordering of events in your proposal may be difficult to reason about. If we don't care about the order, we can discuss its value and feasibility. In my opinion, a data ingest flow is not order-aware; at least, ordering is not that important for us. You can try to verify your proposal and share the results. There may be some difficulty in maintaining transactions across several Sinks.

          -Regards
          Denny Ye

          2012/8/10 Wang, Yongkun | Yongkun | BDD <yongkun.wang@mail.rakuten.com>

          JIRA is down again? I cannot connect to it and comment there.

          I have a proposal in "Transactional Multiplex (fan out) Sink":
          https://issues.apache.org/jira/browse/FLUME-1435
          It contains the design for connecting one channel to multiple sinks.

          You can search the email for it, since JIRA cannot be accessed.

          I think this is more than a configuration issue. If we simply enable several sinks on the same channel, they will take events either in a round-robin fashion or, if the sinks run at different speeds, in an unpredictable order.

          So it's better to have an even higher-level transaction control instead of the transaction in the process() of each sink, as I describe in FLUME-1435.

          Regards,
          Yongkun Wang
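
          For reference, the fan-out case discussed above (both hsa and hsb receiving every event) is commonly configured in Flume with one channel per sink and a replicating channel selector on the source, rather than with two sinks on one channel. The fragment below is a minimal sketch under that assumption. The source name src and the channel names mca and mcb are hypothetical and do not appear in the original report, and component types and other required settings are omitted.

          ```properties
          # Sketch only: src, mca and mcb are hypothetical names.
          agent.sources = src
          agent.channels = mca mcb
          agent.sinks = hsa hsb

          # The replicating selector copies each event into every listed channel.
          agent.sources.src.channels = mca mcb
          agent.sources.src.selector.type = replicating

          # Each sink drains its own channel, so both sinks see all events.
          agent.sinks.hsa.channel = mca
          agent.sinks.hsb.channel = mcb
          ```

          With this layout each channel keeps its own transactions, so a slow or failing sink on one channel does not change what the other sink receives.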

          Hari Shreedharan added a comment -

          What exactly is trying to be accomplished here? Multiple sinks can talk to the same channel. That is a part of the design. I don't see anything wrong in that.

          Also we cannot make changes to any existing behavior which are backward incompatible in any minor or point releases (such changes can be made only between 1.x -> 2.x releases). If a config which worked with a previous release does not work anymore, that is essentially backward incompatibility.

          Denny Ye added a comment -

          This is the first time I have heard that 'Multiple sinks can talk to the same channel. That is a part of the design.' All of the changes in this patch are based on the assumption that 'a single Sink can only talk to a unified Channel.' I may have misunderstood the Flume documentation on this point.

          Juhani Connolly added a comment -

          Multiple sinks pulling from the same channel is allowed behavior. While using some form of grouping (failover/load balancing) is generally recommended, if people have a use case that requires multiple sinks to pull from the same channel, it is entirely valid, and so long as the channel is a consistent one, at least one of the sinks will get a copy of any event that passes into it.
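
          The grouping mentioned above is configured through sink groups. The fragment below is a minimal sketch, assuming the sinks hsa and hsb and the channel mc from the description, a hypothetical group name g1, and a Flume 1.x release that supports the load_balance sink processor. Sink types and other required settings are omitted.

          ```properties
          # Sketch only: g1 is a hypothetical group name.
          agent.sinks = hsa hsb
          agent.sinkgroups = g1
          agent.sinkgroups.g1.sinks = hsa hsb

          # Distribute events between the two sinks; each event is taken by one of them.
          agent.sinkgroups.g1.processor.type = load_balance
          agent.sinkgroups.g1.processor.selector = round_robin

          # Both sinks still drain the same channel.
          agent.sinks.hsa.channel = mc
          agent.sinks.hsb.channel = mc
          ```

          Setting processor.type to failover instead, with a processor.priority.<sink> value per sink, would send events to the highest-priority sink and fall back to the other only when it fails.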

          Denny Ye added a comment -

          Multiple Sinks connecting to a common Channel is valid behavior for a SinkProcessor, even when we are using the load-balancing model.


            People

            • Assignee:
              Denny Ye
              Reporter:
              Denny Ye
            • Votes:
              0
              Watchers:
              5
