Uploaded image for project: 'Flume'
  1. Flume
  2. FLUME-2712

Optional channel errors slows down the Source to Main channel event rate

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: None
    • Labels:
      None

      Description

      When we have a source configured to deliver events to a main channel and an optional channel, and if the delivery to optional channel fails, this significantly slows down the rate at which events are delivered to the main channel by the source.
      We need to evaluate async means of delivering events to the optional channel and isolate the errors happening in optional channel from the delivery to the main channel

      1. FLUME-2712.patch
        7 kB
        Johny Rufus
      2. FLUME-2712.patch
        7 kB
        Johny Rufus
      3. FLUME-2712-1.patch
        10 kB
        Johny Rufus
      4. FLUME-2712-2.patch
        10 kB
        Johny Rufus

        Activity

        Hide
        hudson Hudson added a comment -

        UNSTABLE: Integrated in Flume-trunk-hbase-1 #152 (See https://builds.apache.org/job/Flume-trunk-hbase-1/152/)
        FLUME-2891: Revert FLUME-2712 and FLUME-2886 (jarcec: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=caa64a1a6d4bc97be5993cb468516e9ffe862794)

        • flume-ng-core/src/test/java/org/apache/flume/channel/TestChannelProcessor.java
        • flume-ng-core/src/main/java/org/apache/flume/channel/ChannelProcessor.java
        Show
        hudson Hudson added a comment - UNSTABLE: Integrated in Flume-trunk-hbase-1 #152 (See https://builds.apache.org/job/Flume-trunk-hbase-1/152/ ) FLUME-2891 : Revert FLUME-2712 and FLUME-2886 (jarcec: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=caa64a1a6d4bc97be5993cb468516e9ffe862794 ) flume-ng-core/src/test/java/org/apache/flume/channel/TestChannelProcessor.java flume-ng-core/src/main/java/org/apache/flume/channel/ChannelProcessor.java
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit c2562900f1ab9c21ae2d7c0fc26e270847e81b6d in flume's branch refs/heads/flume-1.7 from Jarek Jarcec Cecho
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=c256290 ]

        FLUME-2891: Revert FLUME-2712 and FLUME-2886

        (Hari Shreedharan via Jarek Jarcec Cecho)

        Show
        jira-bot ASF subversion and git services added a comment - Commit c2562900f1ab9c21ae2d7c0fc26e270847e81b6d in flume's branch refs/heads/flume-1.7 from Jarek Jarcec Cecho [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=c256290 ] FLUME-2891 : Revert FLUME-2712 and FLUME-2886 (Hari Shreedharan via Jarek Jarcec Cecho)
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit caa64a1a6d4bc97be5993cb468516e9ffe862794 in flume's branch refs/heads/trunk from Jarek Jarcec Cecho
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=caa64a1 ]

        FLUME-2891: Revert FLUME-2712 and FLUME-2886

        (Hari Shreedharan via Jarek Jarcec Cecho)

        Show
        jira-bot ASF subversion and git services added a comment - Commit caa64a1a6d4bc97be5993cb468516e9ffe862794 in flume's branch refs/heads/trunk from Jarek Jarcec Cecho [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=caa64a1 ] FLUME-2891 : Revert FLUME-2712 and FLUME-2886 (Hari Shreedharan via Jarek Jarcec Cecho)
        Hide
        hudson Hudson added a comment -

        UNSTABLE: Integrated in Flume-trunk-hbase-1 #132 (See https://builds.apache.org/job/Flume-trunk-hbase-1/132/)
        FLUME-2712. Optional channel errors slows down the Source to Main (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=8bb556604047974775eb2da4c5c1686d89fe62d2)

        • flume-ng-core/src/test/java/org/apache/flume/channel/TestChannelProcessor.java
        • flume-ng-core/src/main/java/org/apache/flume/channel/ChannelProcessor.java
        Show
        hudson Hudson added a comment - UNSTABLE: Integrated in Flume-trunk-hbase-1 #132 (See https://builds.apache.org/job/Flume-trunk-hbase-1/132/ ) FLUME-2712 . Optional channel errors slows down the Source to Main (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=8bb556604047974775eb2da4c5c1686d89fe62d2 ) flume-ng-core/src/test/java/org/apache/flume/channel/TestChannelProcessor.java flume-ng-core/src/main/java/org/apache/flume/channel/ChannelProcessor.java
        Hide
        hshreedharan Hari Shreedharan added a comment -

        Committed! Thanks Johny Rufus!

        Show
        hshreedharan Hari Shreedharan added a comment - Committed! Thanks Johny Rufus !
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit a7ffb27014f5808633f829b5f1e72a0290bd35a5 in flume's branch refs/heads/flume-1.7 from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=a7ffb27 ]

        FLUME-2712. Optional channel errors slows down the Source to Main channel event rate

        (Johny Rufus via Hari)

        Show
        jira-bot ASF subversion and git services added a comment - Commit a7ffb27014f5808633f829b5f1e72a0290bd35a5 in flume's branch refs/heads/flume-1.7 from Hari Shreedharan [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=a7ffb27 ] FLUME-2712 . Optional channel errors slows down the Source to Main channel event rate (Johny Rufus via Hari)
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 8bb556604047974775eb2da4c5c1686d89fe62d2 in flume's branch refs/heads/trunk from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=8bb5566 ]

        FLUME-2712. Optional channel errors slows down the Source to Main channel event rate

        (Johny Rufus via Hari)

        Show
        jira-bot ASF subversion and git services added a comment - Commit 8bb556604047974775eb2da4c5c1686d89fe62d2 in flume's branch refs/heads/trunk from Hari Shreedharan [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=8bb5566 ] FLUME-2712 . Optional channel errors slows down the Source to Main channel event rate (Johny Rufus via Hari)
        Hide
        hshreedharan Hari Shreedharan added a comment -

        +1. LGTM. Running tests and committing

        Show
        hshreedharan Hari Shreedharan added a comment - +1. LGTM. Running tests and committing
        Hide
        jrufus Johny Rufus added a comment -

        Hari Shreedharan, Attaching patch with review comments incorporated

        Show
        jrufus Johny Rufus added a comment - Hari Shreedharan , Attaching patch with review comments incorporated
        Hide
        hshreedharan Hari Shreedharan added a comment -

        Johny,

        This looks good in general. Some minor comments:

        • I think the executeRequiredChannelTransaction can be executeChannelTransaction and can be used for both. Just take an additional flag as an argument and throw the ChannelException based on that. Otherwise the run method in OptionalChannelTransactionExecutor are exactly the same
        • OptionalChannelTransactionExecutor should be named OptionalChannelTransactionThread or something like that - it really is not an executor.
        • List<Event> events = new ArrayList<Event>() -> List<Event> events = new ArrayList<Event>(1) in processEvent method.
        Show
        hshreedharan Hari Shreedharan added a comment - Johny, This looks good in general. Some minor comments: I think the executeRequiredChannelTransaction can be executeChannelTransaction and can be used for both. Just take an additional flag as an argument and throw the ChannelException based on that. Otherwise the run method in OptionalChannelTransactionExecutor are exactly the same OptionalChannelTransactionExecutor should be named OptionalChannelTransactionThread or something like that - it really is not an executor. List<Event> events = new ArrayList<Event>() -> List<Event> events = new ArrayList<Event>(1) in processEvent method.
        Hide
        jrufus Johny Rufus added a comment -

        Attaching patch with the test case and single thread of execution for optional channel

        Show
        jrufus Johny Rufus added a comment - Attaching patch with the test case and single thread of execution for optional channel
        Hide
        jrufus Johny Rufus added a comment -

        Thanks Hari Shreedharan for the catch. I thought we generally dont guarantee ordering, but it makes sense not to break the existing level of ordering. Submitting to a single threaded executor will guarantee the sequential execution of jobs and it works out of an unbounded queue, so let me make that change. Will also work on a test case.

        Show
        jrufus Johny Rufus added a comment - Thanks Hari Shreedharan for the catch. I thought we generally dont guarantee ordering, but it makes sense not to break the existing level of ordering. Submitting to a single threaded executor will guarantee the sequential execution of jobs and it works out of an unbounded queue, so let me make that change. Will also work on a test case.
        Hide
        hshreedharan Hari Shreedharan added a comment -

        Also please add a test to ensure no data gets dropped

        Show
        hshreedharan Hari Shreedharan added a comment - Also please add a test to ensure no data gets dropped
        Hide
        hshreedharan Hari Shreedharan added a comment -

        In general, the idea looks ok. But there is one problem with this patch - the ordering of events in the optional channel is now messed up, since we are using a thread pool to do so. This is true even if there is exactly one source. This is a pretty obvious regression. I think we'd need to ensure ordering by actually having a single thread submitting the events - we should keep a blocking queue and have the thread poll that queue.

        Show
        hshreedharan Hari Shreedharan added a comment - In general, the idea looks ok. But there is one problem with this patch - the ordering of events in the optional channel is now messed up, since we are using a thread pool to do so. This is true even if there is exactly one source. This is a pretty obvious regression. I think we'd need to ensure ordering by actually having a single thread submitting the events - we should keep a blocking queue and have the thread poll that queue.
        Hide
        jrufus Johny Rufus added a comment -

        Attached patch with the changes to deliver to optional channel asynchronously

        Show
        jrufus Johny Rufus added a comment - Attached patch with the changes to deliver to optional channel asynchronously review board link - https://reviews.apache.org/r/39035/
        Hide
        hshreedharan Hari Shreedharan added a comment -

        A good way of doing this would be to write to all channels in parallel via a fixed number of threads (to end up not using too many threads). For the ones from the required channels, wait on the futures, for the others, just don't care - submit and go away.

        Show
        hshreedharan Hari Shreedharan added a comment - A good way of doing this would be to write to all channels in parallel via a fixed number of threads (to end up not using too many threads). For the ones from the required channels, wait on the futures, for the others, just don't care - submit and go away.

          People

          • Assignee:
            jrufus Johny Rufus
            Reporter:
            jrufus Johny Rufus
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development