Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-528

IndexOutOfBoundsException thrown by kafka.consumer.ConsumerFetcherThread

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:

      Description

      1. Attached file system_test_1348521165.tar.gz contains all the associated log files for this test session.

      2. The system test output log can be found at: system_test_1348521165/system_test_output.log

      3. The following log message can be found at: system_test_1348521165/logs/console_consumer-5/console_consumer.log

      [2012-09-24 14:15:12,016] ERROR [ConsumerFetcherThread-console-consumer-16186_jfung-1348521311426-2c83ced7-0-1], Error due to (kafka.consumer.ConsumerFetcherThread)

      java.lang.IndexOutOfBoundsException

      at java.nio.Buffer.checkIndex(Buffer.java:512)

      at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)

      at kafka.message.Message.magic(Message.scala:119)

      at kafka.message.Message.checksum(Message.scala:132)

      at kafka.message.Message.isValid(Message.scala:144)

      at kafka.message.ByteBufferMessageSet$$anon$1.makeNextOuter(ByteBufferMessageSet.scala:118)

      at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:149)

      at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:89)

      at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:61)

      at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:53)

      at kafka.message.ByteBufferMessageSet.shallowValidBytes(ByteBufferMessageSet.scala:54)

      at kafka.message.ByteBufferMessageSet.validBytes(ByteBufferMessageSet.scala:49)

      at kafka.consumer.PartitionTopicInfo.enqueue(PartitionTopicInfo.scala:54)

      at kafka.consumer.ConsumerFetcherThread.processPartitionData(ConsumerFetcherThread.scala:42)

      at kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:103)

      at kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:96)

      at scala.collection.immutable.Map$Map2.foreach(Map.scala:127)

      at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:96)

      at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)

      1. system_test_1348521165.tar.gz
        109 kB
        John Fung
      2. kafka-528.patch
        11 kB
        Jun Rao
      3. kafka-528_v2.patch
        10 kB
        Jun Rao

        Activity

        Hide
        junrao Jun Rao added a comment -

        Attach a patch. There are a couple of critical issues.

        1. FileMessageSet.truncateTo(): There is a java bug in FileChannel.truncate. The problem is that file size is set correctly, but file position is not changed accordingly. Added a fix that manually sets the file position.

        2. ZookeeperConsumerConnector: The call to get the initial offset sometimes fails since it's connected to a broker that's not the leader of a partition. Changed it to set the initial offset in ConsumerFetcherThread.

        Also fixed a few other issues:
        3. Partition: There is unnecessary ISR expansion.

        4. SyncProducer: lastConnectionTime should be set in connect().

        5. DumpLogSegment: Report if log is corrupted at the end of a file.

        6. Added some more tracing code.

        Show
        junrao Jun Rao added a comment - Attach a patch. There are a couple of critical issues. 1. FileMessageSet.truncateTo(): There is a java bug in FileChannel.truncate. The problem is that file size is set correctly, but file position is not changed accordingly. Added a fix that manually sets the file position. 2. ZookeeperConsumerConnector: The call to get the initial offset sometimes fails since it's connected to a broker that's not the leader of a partition. Changed it to set the initial offset in ConsumerFetcherThread. Also fixed a few other issues: 3. Partition: There is unnecessary ISR expansion. 4. SyncProducer: lastConnectionTime should be set in connect(). 5. DumpLogSegment: Report if log is corrupted at the end of a file. 6. Added some more tracing code.
        Hide
        jkreps Jay Kreps added a comment -

        +1 Lgtm.

        You made the logging for leaderAndIsr request to be info level. If it is info level it would be good to rephrase that in a way folks would be able to understand, otherwise we have a log full of really esoteric things that only make sense to us.

        Show
        jkreps Jay Kreps added a comment - +1 Lgtm. You made the logging for leaderAndIsr request to be info level. If it is info level it would be good to rephrase that in a way folks would be able to understand, otherwise we have a log full of really esoteric things that only make sense to us.
        Hide
        jfung John Fung added a comment - - edited

        1. Tested kafka-528.patch with rev. 1389460 and there is no data loss now. The Exception specified in this JIRA is not showing.

        2. However, there is still another exception showing in broker log (already filed under KAFKA-529) :

        [2012-09-28 23:05:54,669] ERROR [ReplicaFetcherThread-2-0-on-broker-3], Error due to (kafka.server.ReplicaFetcherThread)
        java.lang.IndexOutOfBoundsException
        at java.nio.Buffer.checkIndex(Buffer.java:514)
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
        at kafka.message.Message.magic(Message.scala:119)
        at kafka.message.Message.checksum(Message.scala:132)
        at kafka.message.Message.isValid(Message.scala:144)

        Show
        jfung John Fung added a comment - - edited 1. Tested kafka-528.patch with rev. 1389460 and there is no data loss now. The Exception specified in this JIRA is not showing. 2. However, there is still another exception showing in broker log (already filed under KAFKA-529 ) : [2012-09-28 23:05:54,669] ERROR [ReplicaFetcherThread-2-0-on-broker-3] , Error due to (kafka.server.ReplicaFetcherThread) java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:514) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121) at kafka.message.Message.magic(Message.scala:119) at kafka.message.Message.checksum(Message.scala:132) at kafka.message.Message.isValid(Message.scala:144)
        Hide
        junrao Jun Rao added a comment - - edited

        Attach patch v2. Fixed one additional bug in KafkaController.

        The bug is that Map(("a",0)>0,("a",1)>0)).map(_,_1) returns Map("a" -> 1) while we expect it to be Seq(("a,0),("a",1)). Fixed it by using .keySet instead.

        With patch v2, our system test now passes.

        Show
        junrao Jun Rao added a comment - - edited Attach patch v2. Fixed one additional bug in KafkaController. The bug is that Map(("a",0) >0,("a",1) >0)).map(_,_1) returns Map("a" -> 1) while we expect it to be Seq(("a,0),("a",1)). Fixed it by using .keySet instead. With patch v2, our system test now passes.
        Hide
        nehanarkhede Neha Narkhede added a comment -

        +1 on patch v2.

        Minor comments on logging -

        1. The debug statement in Replica says "Updated highwatermark to ". It will be good to know the topic and partition as well.
        2. The println in DumpLogSegment says "Log is corrupted from... Is corrupted". Let's fix that.

        Show
        nehanarkhede Neha Narkhede added a comment - +1 on patch v2. Minor comments on logging - 1. The debug statement in Replica says "Updated highwatermark to ". It will be good to know the topic and partition as well. 2. The println in DumpLogSegment says "Log is corrupted from... Is corrupted". Let's fix that.
        Hide
        junrao Jun Rao added a comment -

        Thanks for the review. Committed to 0.8 with some minor changes.

        Show
        junrao Jun Rao added a comment - Thanks for the review. Committed to 0.8 with some minor changes.

          People

          • Assignee:
            junrao Jun Rao
            Reporter:
            jfung John Fung
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development