  1. Kafka
  2. KAFKA-3251

Requesting committed offsets results in inconsistent results


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.9.0.0
    • Fix Version/s: None
    • Component/s: offset manager
    • Labels: None

    Description

      Hi,

      I am using github.com/Shopify/sarama to retrieve the committed offsets for a high-volume topic, but the bug seems to originate in Kafka itself.

      I have written a little test to query the offsets of all partitions of one topic, every second. The request looks like this:

      OffsetFetchRequest{
        ConsumerGroup: "my-group-name",
        Version: 1,
        TopicPartitions: []TopicPartition{
           {TopicName: "logs", Partitions: []int32{0,1,2,3,4,5,6,7}},
        },
      }
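
      For anyone who wants to reproduce this, the polling test can be sketched roughly as follows. This is not the original test code: `FetchFunc` is a hypothetical stand-in for whatever client call sends the OffsetFetchRequest (in my case that goes through sarama), and `PartitionOffset`, `FormatLine` and `Poll` are names made up for this sketch. The line format mirrors the log output in this report, minus the timestamp.

```go
package main

import (
	"fmt"
	"time"
)

// PartitionOffset is one entry of an offset fetch response
// (hypothetical type, not part of any client library).
type PartitionOffset struct {
	Partition int32
	ErrorCode int16
	Offset    int64
}

// FetchFunc is a hypothetical stand-in for the client call that sends
// an OffsetFetchRequest for one group/topic and returns the results.
type FetchFunc func(group, topic string, partitions []int32) ([]PartitionOffset, error)

// FormatLine renders one result in the same shape as the log lines
// in this report (without the timestamp prefix).
func FormatLine(topic string, po PartitionOffset) string {
	return fmt.Sprintf("topic=%s partition=%02d error=%d offset=%d",
		topic, po.Partition, po.ErrorCode, po.Offset)
}

// Poll calls fetch `rounds` times, waiting `interval` between rounds,
// and hands every formatted line to emit.
func Poll(fetch FetchFunc, group, topic string, partitions []int32,
	interval time.Duration, rounds int, emit func(string)) error {
	for i := 0; i < rounds; i++ {
		if i > 0 {
			time.Sleep(interval)
		}
		results, err := fetch(group, topic, partitions)
		if err != nil {
			return err
		}
		for _, po := range results {
			emit(FormatLine(topic, po))
		}
	}
	return nil
}
```

      With a real fetch function plugged in, calling `Poll` with `interval` set to `time.Second` and `emit` set to `log.Println` should give output shaped like the log in this report.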
      

      Most of the time, the responses are correct, but every 10 minutes or so, there is a brief glitch. I am not familiar with the Kafka internals, but it looks like a race condition. Here's my log output:

      ...
      
      2016/02/19 09:48:10 topic=logs partition=00 error=0 offset=206567925
      2016/02/19 09:48:10 topic=logs partition=01 error=0 offset=206671019
      2016/02/19 09:48:10 topic=logs partition=02 error=0 offset=206567995
      2016/02/19 09:48:10 topic=logs partition=03 error=0 offset=205785315
      2016/02/19 09:48:10 topic=logs partition=04 error=0 offset=206526677
      2016/02/19 09:48:10 topic=logs partition=05 error=0 offset=206713764
      2016/02/19 09:48:10 topic=logs partition=06 error=0 offset=206524006
      2016/02/19 09:48:10 topic=logs partition=07 error=0 offset=206629121
      
      2016/02/19 09:48:11 topic=logs partition=00 error=0 offset=206572870
      2016/02/19 09:48:11 topic=logs partition=01 error=0 offset=206675966
      2016/02/19 09:48:11 topic=logs partition=02 error=0 offset=206573267
      2016/02/19 09:48:11 topic=logs partition=03 error=0 offset=205790613
      2016/02/19 09:48:11 topic=logs partition=04 error=0 offset=206531841
      2016/02/19 09:48:11 topic=logs partition=05 error=0 offset=206718513
      2016/02/19 09:48:11 topic=logs partition=06 error=0 offset=206529762
      2016/02/19 09:48:11 topic=logs partition=07 error=0 offset=206634037
      
      2016/02/19 09:48:12 topic=logs partition=00 error=0 offset=-1
      2016/02/19 09:48:12 topic=logs partition=01 error=0 offset=-1
      2016/02/19 09:48:12 topic=logs partition=02 error=0 offset=-1
      2016/02/19 09:48:12 topic=logs partition=03 error=0 offset=-1
      2016/02/19 09:48:12 topic=logs partition=04 error=0 offset=-1
      2016/02/19 09:48:12 topic=logs partition=05 error=0 offset=-1
      2016/02/19 09:48:12 topic=logs partition=06 error=0 offset=-1
      2016/02/19 09:48:12 topic=logs partition=07 error=0 offset=-1
      
      2016/02/19 09:48:13 topic=logs partition=00 error=0 offset=-1
      2016/02/19 09:48:13 topic=logs partition=01 error=0 offset=206686020
      2016/02/19 09:48:13 topic=logs partition=02 error=0 offset=206583861
      2016/02/19 09:48:13 topic=logs partition=03 error=0 offset=205800480
      2016/02/19 09:48:13 topic=logs partition=04 error=0 offset=206542733
      2016/02/19 09:48:13 topic=logs partition=05 error=0 offset=206728251
      2016/02/19 09:48:13 topic=logs partition=06 error=0 offset=206534794
      2016/02/19 09:48:13 topic=logs partition=07 error=0 offset=206643853
      
      2016/02/19 09:48:14 topic=logs partition=00 error=0 offset=206584533
      2016/02/19 09:48:14 topic=logs partition=01 error=0 offset=206690275
      2016/02/19 09:48:14 topic=logs partition=02 error=0 offset=206588902
      2016/02/19 09:48:14 topic=logs partition=03 error=0 offset=205805413
      2016/02/19 09:48:14 topic=logs partition=04 error=0 offset=206542733
      2016/02/19 09:48:14 topic=logs partition=05 error=0 offset=206733144
      2016/02/19 09:48:14 topic=logs partition=06 error=0 offset=206540275
      2016/02/19 09:48:14 topic=logs partition=07 error=0 offset=206649392
      ...
      

      As you can see, the returned error code is 0 and there is no obvious reason why the returned offsets are suddenly wrong/blank.
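
      To spot these rounds automatically, a small client-side check along the following lines could be used. It is only a sketch under my own assumptions: `Sample`, `GlitchDetector` and `Observe` are hypothetical names, and it treats "error code 0 with offset -1 on a partition that already returned a committed offset" as suspicious. As far as I understand, -1 with no error is normally how the broker reports that no offset has been committed, so a group that has never committed is deliberately not flagged.

```go
package main

// Sample is one (partition, error code, offset) triple taken from an
// offset fetch response (hypothetical type for this sketch).
type Sample struct {
	Partition int32
	ErrorCode int16
	Offset    int64
}

// GlitchDetector remembers the last non-negative committed offset
// seen for each partition.
type GlitchDetector struct {
	lastGood map[int32]int64
}

// NewGlitchDetector returns a detector with no history.
func NewGlitchDetector() *GlitchDetector {
	return &GlitchDetector{lastGood: make(map[int32]int64)}
}

// Observe records one polling round and returns the partitions that
// look like the glitch: error code 0 and a negative offset, although a
// committed offset had been seen for that partition before.
func (d *GlitchDetector) Observe(round []Sample) []int32 {
	var suspicious []int32
	for _, s := range round {
		if s.ErrorCode != 0 {
			continue // a real error is reported, not the silent glitch
		}
		if s.Offset >= 0 {
			d.lastGood[s.Partition] = s.Offset
			continue
		}
		if _, seen := d.lastGood[s.Partition]; seen {
			suspicious = append(suspicious, s.Partition)
		}
	}
	return suspicious
}
```

      Fed with the rounds from the log above, this flags all eight partitions at 09:48:12 and only partition 0 at 09:48:13.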

      I have also added some debugging to our offset committer to make sure the offsets we are sending are correct, and they are.

      Any help is greatly appreciated!


          People

            hachikuji Jason Gustafson
            dimitrij.denissenko@blacksquaremedia.com Dimitrij Denissenko