Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: None
    • Labels:
      None

      Description

      Kafka source, sink and channel should have metrics. This will help us track down possible issues or performance problems.

      Here are the metrics I came up with:

      kafka.next.time - Time spent waiting for events from Kafka (source and channel)
      kafka.send.time - Time spent sending events (channel and sink)
      kafka.commit.time - Time spent committing (source and channel)
      events.sent - Number of events sent to Kafka (sink and channel)
      events.read - Number of events read from Kafka (channel and source)
      events.rollback - Number of events rolled back (channel) or number of rollback calls (sink)
      kafka.empty - Number of times backing off due to an empty Kafka topic (source)
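
A minimal sketch of how a timing metric such as kafka.send.time and a count such as events.sent could be accumulated around a send call. The class name `KafkaTimingSketch`, the method `timedSend`, and the choice of milliseconds are assumptions made for illustration; they are not taken from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.IntSupplier;

class KafkaTimingSketch {
    // Cumulative time spent sending, in milliseconds (unit is an assumption).
    private final AtomicLong sendTimeMillis = new AtomicLong();
    // Cumulative number of events sent.
    private final AtomicLong eventsSent = new AtomicLong();

    // Runs one send attempt, which reports how many events it sent,
    // then adds the elapsed time and the event count to the counters.
    int timedSend(IntSupplier sendBatch) {
        long start = System.nanoTime();
        int sent = sendBatch.getAsInt();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        sendTimeMillis.addAndGet(elapsedMillis);
        eventsSent.addAndGet(sent);
        return sent;
    }

    long getSendTimeMillis() { return sendTimeMillis.get(); }

    long getEventsSent() { return eventsSent.get(); }

    public static void main(String[] args) {
        KafkaTimingSketch metrics = new KafkaTimingSketch();
        // Stand-in for a real producer call: pretend each batch sends 10 events.
        for (int i = 0; i < 3; i++) {
            metrics.timedSend(() -> 10);
        }
        System.out.println("events.sent=" + metrics.getEventsSent());
        System.out.println("kafka.send.time(ms)=" + metrics.getSendTimeMillis());
    }
}
```

Cumulative totals are used here so that a monitoring tool can derive rates by sampling the values periodically.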

      1. FLUME-2562.1.patch
        26 kB
        Gwen Shapira
      2. FLUME-2562.2.patch
        26 kB
        Gwen Shapira

        Activity

        hudson Hudson added a comment -

        SUCCESS: Integrated in Flume-trunk-hbase-98 #64 (See https://builds.apache.org/job/Flume-trunk-hbase-98/64/)
        FLUME-2562. Add metrics for Kafka Source, Kafka Sink and Kafka Channel. (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=1d9bab6760df38e538705a74dd599de03129777b)

        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaChannelCounter.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSourceCounterMBean.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSinkCounter.java
        • flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSourceCounter.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSinkCounterMBean.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaChannelCounterMBean.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/SourceCounter.java
        • flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/ChannelCounter.java
        • flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/SinkCounter.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in flume-trunk #707 (See https://builds.apache.org/job/flume-trunk/707/)
        FLUME-2562. Add metrics for Kafka Source, Kafka Sink and Kafka Channel. (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=1d9bab6760df38e538705a74dd599de03129777b)

        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/ChannelCounter.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSinkCounterMBean.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSinkCounter.java
        • flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSourceCounterMBean.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaChannelCounter.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaSourceCounter.java
        • flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/kafka/KafkaChannelCounterMBean.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/SourceCounter.java
        • flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java
        • flume-ng-core/src/main/java/org/apache/flume/instrumentation/SinkCounter.java
        hshreedharan Hari Shreedharan added a comment -

        Committed! Thanks Gwen!

        jira-bot ASF subversion and git services added a comment -

        Commit 3eae844f366f770979b7f0cbf4a396f09465f7e4 in flume's branch refs/heads/flume-1.6 from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=3eae844 ]

        FLUME-2562. Add metrics for Kafka Source, Kafka Sink and Kafka Channel.

        (Gwen Shapira via Hari)

        jira-bot ASF subversion and git services added a comment -

        Commit 1d9bab6760df38e538705a74dd599de03129777b in flume's branch refs/heads/trunk from Hari Shreedharan
        [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=1d9bab6 ]

        FLUME-2562. Add metrics for Kafka Source, Kafka Sink and Kafka Channel.

        (Gwen Shapira via Hari)

        gwenshap Gwen Shapira added a comment -

        Thanks

        hshreedharan Hari Shreedharan added a comment -

        +1. Committing!

        gwenshap Gwen Shapira added a comment -

        Attaching patch after rebase

        hshreedharan Hari Shreedharan added a comment -

        Ping

        hshreedharan Hari Shreedharan added a comment -

        Gwen - This patch no longer applies. Could you please rebase?

        hshreedharan Hari Shreedharan added a comment -

        +1. This looks good. I will run tests later today and commit this.

        paliwalashish Ashish Paliwal added a comment - - edited

        Any chance of getting this patch into trunk?

        paliwalashish Ashish Paliwal added a comment -

        The patch looks good as is. What I meant was that if we can find a reusable way to expose Counters via JMX, other Source/Channel/Sink implementations won't need to create their own MBeans. This is not necessarily part of this patch; it may be a different JIRA by itself. Just a thought.

        gwenshap Gwen Shapira added a comment -

        Ashish Paliwal - I am afraid I do not fully understand your comment. Perhaps I'm missing some context.

        Are you asking to wait with this patch because you are working on a change that will make this patch simpler? If so, can you point me at where the rest of the work is proceeding?

        paliwalashish Ashish Paliwal added a comment -

        Gwen Shapira Can we find a generic way to expose the Counters? Extending Counters for each specific class would be a lot of boilerplate code. The extended classes have one of the changes needed. This is good to support any number of Counters. The only thing needed is a change in the JMX infra to expose this. Or we can commit this and generalise the changes later. The codahale patch I am working on follows the same path, so if this goes in early, I would like to rebase the patch.

        gwenshap Gwen Shapira added a comment -

        Example of new metrics:

        "CHANNEL.channel1":{"EventPutSuccessCount":"98","ChannelFillPercentage":"1.7976931348623157E308","KafkaEventGetTimer":"2","Type":"CHANNEL","KafkaEventSendTimer":"565","KafkaCommitTimer":"1142","EventTakeSuccessCount":"95","StopTime":"0","ChannelSize":"0","EventPutAttemptCount":"0","StartTime":"1416893965625","RollbackCount":"0","ChannelCapacity":"0","EventTakeAttemptCount":"0"}

        "SOURCE.source1":{"KafkaEventGetTimer":"7377","OpenConnectionCount":"0","Type":"SOURCE","AppendBatchAcceptedCount":"0","AppendBatchReceivedCount":"0","EventAcceptedCount":"30","AppendReceivedCount":"0","StopTime":"0","StartTime":"1416953795734","EventReceivedCount":"30","KafkaCommitTimer":"1129","AppendAcceptedCount":"0"}

        "SINK.sink1":{"Type":"SINK","ConnectionClosedCount":"0","EventDrainSuccessCount":"437","KafkaEventSendTimer":"2144","ConnectionFailedCount":"0","BatchCompleteCount":"0","EventDrainAttemptCount":"0","ConnectionCreatedCount":"0","BatchEmptyCount":"0","StopTime":"0","RollbackCount":"0","StartTime":"1416960016259","BatchUnderflowCount":"0"}

        gwenshap Gwen Shapira added a comment -

        Version of the patch that uses MonitoredCounterGroup hierarchy.

        I verified that all components (source, sink, channel) work and that they report metrics through the http port.

        paliwalashish Ashish Paliwal added a comment -

        IMHO we need to make some functions in MonitoredCounterGroup visible to using classes (extended classes internally use them). For JMX, we can expose the counterMap using DynamicMBean or some other way.

        CounterGroup class is used across the code (19 usages); maybe we can either plug that code into the Metrics system or phase it out.
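
A rough sketch of the DynamicMBean idea mentioned above: a single bean whose attributes are whatever keys are currently in a counter map, so new counters need no new MBean interface. Everything here is hypothetical (the class names `DynamicCounterSketch` and `CounterMapBean`, the `flume.sketch` JMX domain); it only uses the standard `javax.management` API and is not Flume's MonitoredCounterGroup.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.Attribute;
import javax.management.AttributeList;
import javax.management.AttributeNotFoundException;
import javax.management.DynamicMBean;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.ReflectionException;

class DynamicCounterSketch {

    // Hypothetical bean: every key in counterMap becomes a read-only JMX attribute.
    static class CounterMapBean implements DynamicMBean {
        private final Map<String, AtomicLong> counterMap = new ConcurrentHashMap<>();

        long increment(String name) {
            return counterMap.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
        }

        @Override
        public Object getAttribute(String name) throws AttributeNotFoundException {
            AtomicLong value = counterMap.get(name);
            if (value == null) {
                throw new AttributeNotFoundException(name);
            }
            return value.get();
        }

        @Override
        public AttributeList getAttributes(String[] names) {
            AttributeList result = new AttributeList();
            for (String name : names) {
                AtomicLong value = counterMap.get(name);
                if (value != null) {
                    result.add(new Attribute(name, value.get()));
                }
            }
            return result;
        }

        @Override
        public void setAttribute(Attribute attribute) throws AttributeNotFoundException {
            throw new AttributeNotFoundException("Counters are read-only: " + attribute.getName());
        }

        @Override
        public AttributeList setAttributes(AttributeList attributes) {
            return new AttributeList(); // nothing is writable, so nothing was set
        }

        @Override
        public Object invoke(String action, Object[] params, String[] signature)
                throws ReflectionException {
            throw new ReflectionException(new NoSuchMethodException(action));
        }

        @Override
        public MBeanInfo getMBeanInfo() {
            // Describe one read-only Long attribute per counter currently in the map.
            MBeanAttributeInfo[] attrs = counterMap.keySet().stream().sorted()
                    .map(n -> new MBeanAttributeInfo(n, "java.lang.Long", "counter " + n,
                            true, false, false))
                    .toArray(MBeanAttributeInfo[]::new);
            return new MBeanInfo(getClass().getName(), "Counters backed by a map",
                    attrs, null, null, null);
        }
    }

    public static void main(String[] args) throws Exception {
        CounterMapBean bean = new CounterMapBean();
        bean.increment("events.sent");
        bean.increment("events.sent");
        bean.increment("kafka.empty");

        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical object name, chosen only for this sketch.
        ObjectName name = new ObjectName("flume.sketch:type=CHANNEL,name=channel1");
        server.registerMBean(bean, name);
        System.out.println("events.sent=" + server.getAttribute(name, "events.sent"));
    }
}
```

The trade-off is that attribute names are only known at runtime, so tools cannot rely on a compile-time interface the way they can with a standard MBean.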

        otis Otis Gospodnetic added a comment -

        I didn't look at the MBeans structure, but I'd like to point out the work we did for Kafka over in KAFKA-1481 where we had to completely rework MBeans naming to make them programmatically parseable with tools like SPM (http://sematext.com/spm/) . We also added Kafka version to MBeans, so tools can programmatically see which version they are talking to, because MBeans unfortunately tend to change between releases. I'm pointing this out here in hopes of Flume avoiding these mistakes.

        hshreedharan Hari Shreedharan added a comment -

        Extending is fine. In fact, I think we could make the classes themselves more extensible so we can add custom metrics. The only issue is that you need to add methods for this class in an MBean interface (See SourceCounterMBean class), so JMX can actually poll these metrics.
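
To illustrate the point about the MBean interface, here is a self-contained sketch using plain JMX standard MBeans. It does not use Flume's classes: `BaseCounter`, `KafkaCounter`, their `*MBean` interfaces, and the `flume.sketch` domain are stand-ins invented for this example, while the getter names mirror the Kafka metrics discussed in this issue. The idea it shows is that a getter becomes visible to JMX only once it is declared in the interface named `<ClassName>MBean`.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class MBeanExtensionSketch {

    // Stand-in for a generic counter's management interface.
    public interface BaseCounterMBean {
        long getEventReceivedCount();
    }

    // Kafka-specific getters must be declared here, or JMX will not expose them.
    public interface KafkaCounterMBean extends BaseCounterMBean {
        long getKafkaEventGetTimer();
        long getKafkaCommitTimer();
    }

    public static class BaseCounter implements BaseCounterMBean {
        private final AtomicLong eventReceived = new AtomicLong();

        public long addToEventReceivedCount(long delta) {
            return eventReceived.addAndGet(delta);
        }

        @Override
        public long getEventReceivedCount() {
            return eventReceived.get();
        }
    }

    // The extended counter adds new metrics on top of the inherited ones.
    public static class KafkaCounter extends BaseCounter implements KafkaCounterMBean {
        private final AtomicLong getTimer = new AtomicLong();
        private final AtomicLong commitTimer = new AtomicLong();

        public long addToKafkaEventGetTimer(long delta) {
            return getTimer.addAndGet(delta);
        }

        public long addToKafkaCommitTimer(long delta) {
            return commitTimer.addAndGet(delta);
        }

        @Override
        public long getKafkaEventGetTimer() {
            return getTimer.get();
        }

        @Override
        public long getKafkaCommitTimer() {
            return commitTimer.get();
        }
    }

    // Registers the counter, reads one attribute back through JMX, then unregisters it.
    static long registerAndRead(KafkaCounter counter, String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(objectName);
            server.registerMBean(counter, name);
            try {
                return (Long) server.getAttribute(name, attribute);
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException("JMX access failed", e);
        }
    }

    public static void main(String[] args) {
        KafkaCounter counter = new KafkaCounter();
        counter.addToEventReceivedCount(30);
        counter.addToKafkaEventGetTimer(7377);
        // Hypothetical object name, chosen only for this sketch.
        String name = "flume.sketch:type=SOURCE,name=source1";
        System.out.println("KafkaEventGetTimer=" + registerAndRead(counter, name, "KafkaEventGetTimer"));
        System.out.println("EventReceivedCount=" + registerAndRead(counter, name, "EventReceivedCount"));
    }
}
```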

        gwenshap Gwen Shapira added a comment -

        Removed the patch since we want one that uses the MonitoredCounterGroup

        gwenshap Gwen Shapira added a comment -

        Looks like I got the wrong counters here

        I used CounterGroup where I probably should have used SourceCounter, SinkCounter and ChannelCounter which are exposed via http.

        I assume that extending those classes to add some Kafka-specific metrics is acceptable?
        I'm asking because it looks like no one extended them before...

        paliwalashish Ashish Paliwal added a comment -

        AFAIK, no direct way of testing this. These metrics cannot be exposed via JMX or by other means, as they rely on JMX internally (meaning we could reuse existing test cases). One way I can think of is to get hold of counterGroup and then asserting on it. Spring has ReflectionTestUtils, there might be something similar in junit as well
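
The "get hold of counterGroup and assert on it" idea can be sketched with plain `java.lang.reflect`, with no Spring dependency. `DemoComponent` and its private `counterGroup` field are hypothetical stand-ins, modelled here as a simple map; a real Flume component's field name and type would have to be checked against the source.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class ReflectiveCounterCheck {

    // Hypothetical component that keeps its counters in a private field.
    static class DemoComponent {
        private final Map<String, AtomicLong> counterGroup = new ConcurrentHashMap<>();

        void process(int events) {
            counterGroup.computeIfAbsent("events.read", k -> new AtomicLong()).addAndGet(events);
        }
    }

    // Reads a named counter out of a private Map<String, AtomicLong> field.
    // Returns 0 if the counter has never been touched.
    static long readCounter(Object target, String fieldName, String counterName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // bypass the private modifier for the test
            @SuppressWarnings("unchecked")
            Map<String, AtomicLong> counters = (Map<String, AtomicLong>) field.get(target);
            AtomicLong value = counters.get(counterName);
            return value == null ? 0L : value.get();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        DemoComponent component = new DemoComponent();
        component.process(30);
        long read = readCounter(component, "counterGroup", "events.read");
        if (read != 30) {
            throw new AssertionError("expected 30 events but counter says " + read);
        }
        System.out.println("events.read=" + read);
    }
}
```

A test written this way is coupled to a private field name, so it breaks silently on renames; a package-private accessor on the component would be a less brittle alternative.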

        gwenshap Gwen Shapira added a comment -

        I tested that unit-tests are passing.

        I'm not sure how to test that metrics are indeed collected as expected. Advice will be appreciated.


          People

          • Assignee:
            gwenshap Gwen Shapira
            Reporter:
            gwenshap Gwen Shapira
          • Votes:
            0
            Watchers:
            7
