  1. Flume
  2. FLUME-2781

A Kafka Channel defined as parseAsFlumeEvent=false cannot be correctly used by a Flume source

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.7.0
    • Component/s: None
    • Labels:
    • Release Note:
      When Flume writes to a channel defined as parseAsFlumeEvent=false, use text instead of Avro

      Description

      When a Kafka channel is configured as parseAsFlumeEvent=false, the channel will read events from the topic as text instead of serialized Avro Flume events.
      This is useful so Flume can read from an existing Kafka topic, where other Kafka clients publish as text.

      However, a Flume source writing to that channel still serializes events as Avro, so the topic ends up with mixed formats and those events cannot be read back correctly.

      Fixing this would also let a Flume source write to a Kafka channel while any Kafka subscriber reads the Flume events passing through, without binary dependencies.
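
The mismatch described above can be sketched in a few lines of Java. This is only an illustration, not the KafkaChannel code: the class and method names (PutTakeSketch, wrap, unwrapBody, put, take) are hypothetical, and the wrapper format is a simple length-prefixed stand-in rather than the real Avro encoding. It shows that when the write side always wraps and the read side honors parseAsFlumeEvent=false, the body comes back corrupted, and that honoring the flag on both sides round-trips cleanly.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class PutTakeSketch {

  // Stand-in for the serialized Flume event wrapper: length-prefixed headers
  // followed by a length-prefixed body. This is NOT the real Avro wire format.
  static byte[] wrap(Map<String, String> headers, byte[] body) {
    int size = 4 + 4 + body.length;
    for (Map.Entry<String, String> e : headers.entrySet()) {
      size += 4 + utf8(e.getKey()).length + 4 + utf8(e.getValue()).length;
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    buf.putInt(headers.size());
    for (Map.Entry<String, String> e : headers.entrySet()) {
      putField(buf, utf8(e.getKey()));
      putField(buf, utf8(e.getValue()));
    }
    putField(buf, body);
    return buf.array();
  }

  // Reverses wrap(): skips the headers and returns only the body.
  static byte[] unwrapBody(byte[] payload) {
    ByteBuffer buf = ByteBuffer.wrap(payload);
    int headerCount = buf.getInt();
    for (int i = 0; i < headerCount * 2; i++) {
      getField(buf); // skip one key or one value
    }
    return getField(buf);
  }

  // Write side that honors the flag (the behavior this issue asks for).
  static byte[] put(Map<String, String> headers, byte[] body, boolean parseAsFlumeEvent) {
    return parseAsFlumeEvent ? wrap(headers, body) : body;
  }

  // Read side: with the flag off, the whole payload is taken as the body.
  static byte[] take(byte[] payload, boolean parseAsFlumeEvent) {
    return parseAsFlumeEvent ? unwrapBody(payload) : payload;
  }

  static byte[] utf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  static void putField(ByteBuffer buf, byte[] bytes) {
    buf.putInt(bytes.length);
    buf.put(bytes);
  }

  static byte[] getField(ByteBuffer buf) {
    byte[] bytes = new byte[buf.getInt()];
    buf.get(bytes);
    return bytes;
  }

  public static void main(String[] args) {
    Map<String, String> headers = new LinkedHashMap<>();
    headers.put("host", "example");
    byte[] body = utf8("hello");

    // Reported behavior: the write always wraps, the read does not unwrap.
    byte[] mismatched = take(wrap(headers, body), false);
    System.out.println(new String(mismatched, StandardCharsets.UTF_8).equals("hello")); // false

    // Requested behavior: both sides honor parseAsFlumeEvent=false.
    byte[] consistent = take(put(headers, body, false), false);
    System.out.println(new String(consistent, StandardCharsets.UTF_8)); // hello
  }
}
```

Note that with the flag off only the body is written in this sketch, so the headers do not travel with the event.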

      1. FLUME-2781.patch
        4 kB
        Gonzalo Herreros

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user gherreros opened a pull request:

        https://github.com/apache/flume/pull/24

        FLUME-2781 proposed implementation

        Implemented a small change so that putting to the channel is consistent with taking from it: if parseAsFlumeEvent is false, the channel writes the event body text instead of an Avro Flume event object.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/gherreros/flume trunk

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/flume/pull/24.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #24


        commit 2c02cfa484460056ec4d040dc2ec568e19fa916b
        Author: gherreros <gonzalo_herreros@mastercard.com>
        Date: 2015-09-01T13:07:55Z

        FLUME-2781 When the Kafka channel is configured as
        parseAsFlumeEvent=false, any writes to the topic are done as plain text
        (so the events can be read correctly)


        gherreros Gonzalo Herreros added a comment -

        Proposed implementation of the issue

        gherreros Gonzalo Herreros added a comment -

        Simple fix and the corresponding unit test

        hshreedharan Hari Shreedharan added a comment -

        This looks good. There are a few formatting issues: we use 2-space indents everywhere, but the patch uses a mix of 2 and 4 spaces. This code:

        if (parseAsFlumeEvent){
        

        should be:

        if (parseAsFlumeEvent) {
        
        gherreros Gonzalo Herreros added a comment -

        Updated formatting

        hshreedharan Hari Shreedharan added a comment -

        +1. LGTM. I am running the tests now. Will commit once the tests pass.

        jira-bot ASF subversion and git services added a comment -

        Commit faad35801b24b9f0ca34d8b86f28dded468d73b8 in flume's branch refs/heads/flume-1.7 from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=faad358 ]

        FLUME-2781. Kafka Channel with parseAsFlumeEvent=true should write data as is, not as flume events.

        (Gonzalo Herreros via Hari)

        hshreedharan Hari Shreedharan added a comment -

        Committed! Thanks Gonzalo Herreros!

        jira-bot ASF subversion and git services added a comment -

        Commit 67ed62aa18df3675b68369d0d00c8f0dcbdfb970 in flume's branch refs/heads/trunk from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=67ed62a ]

        FLUME-2781. Kafka Channel with parseAsFlumeEvent=true should write data as is, not as flume events.

        (Gonzalo Herreros via Hari)

        hudson Hudson added a comment -

        UNSTABLE: Integrated in Flume-trunk-hbase-1 #129 (See https://builds.apache.org/job/Flume-trunk-hbase-1/129/)
        FLUME-2781. Kafka Channel with parseAsFlumeEvent=true should write data (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=67ed62aa18df3675b68369d0d00c8f0dcbdfb970)

        • flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java
        • flume-ng-channels/flume-kafka-channel/src/test/java/org/apache/flume/channel/kafka/TestKafkaChannel.java
        roshan_naik Roshan Naik added a comment -

        Gonzalo Herreros this is a very useful feature!

        I don't see any note in the UserGuide about this, so I need a clarification.

        Setting parseAsFlumeEvent=false will cause events to be written as-is into the Kafka topic, without the FlumeEvent wrapper, right?

        gherreros Gonzalo Herreros added a comment -

        That's correct. You can use KafkaConsoleConsumer to verify the difference.
        With the flag set to true (the default), each line is preceded by a couple of strange characters (it's Avro), whereas with it set to false you get only the event body as it is.
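
To make the "couple of strange characters" concrete, here is a minimal sketch of what a plain-text consumer would see. It assumes the wrapper is an Avro record consisting of a map of string headers followed by a bytes body, and it hand-encodes that record for an event with no headers using the Avro binary rules (zigzag varint lengths, a zero count for an empty map). The class and method names (ConsumerViewSketch, wrapWithEmptyHeaders, render) are hypothetical, and this is not the serializer Flume itself uses.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ConsumerViewSketch {

  // Avro binary encoding of a long: zigzag transform, then base-128 varint.
  static void writeLong(ByteArrayOutputStream out, long n) {
    long z = (n << 1) ^ (n >> 63);
    while ((z & ~0x7FL) != 0) {
      out.write((int) ((z & 0x7F) | 0x80));
      z >>>= 7;
    }
    out.write((int) z);
  }

  // Encodes a record of (empty headers map, bytes body).
  // Assumption: this field order matches the Flume event wrapper.
  static byte[] wrapWithEmptyHeaders(byte[] body) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeLong(out, 0);               // empty map: a single zero block count
    writeLong(out, body.length);     // bytes field: length prefix
    out.write(body, 0, body.length); // bytes field: raw data
    return out.toByteArray();
  }

  // Renders a payload as text, escaping bytes outside printable ASCII.
  static String render(byte[] payload) {
    StringBuilder sb = new StringBuilder();
    for (byte b : payload) {
      int c = b & 0xFF;
      if (c >= 0x20 && c < 0x7F) {
        sb.append((char) c);
      } else {
        sb.append(String.format("\\x%02X", c));
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
    System.out.println(render(wrapWithEmptyHeaders(body))); // \x00\x0Ahello
    System.out.println(render(body));                       // hello
  }
}
```

With headers present, the prefix would grow to include the encoded keys and values, so the number of leading bytes depends on the event.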

        roshan_naik Roshan Naik added a comment -

        Thanks for confirming, Gonzalo Herreros.

        I got confused when I saw this statement in some of the automated comments in this JIRA:

        "Kafka Channel with parseAsFlumeEvent=true should write data as is, not as flume events."

        I believe that is exactly the opposite of what is intended.


          People

          • Assignee:
            gherreros Gonzalo Herreros
            Reporter:
            gherreros Gonzalo Herreros