Kafka
  1. Kafka
  2. KAFKA-528

IndexOutOfBoundsException thrown by kafka.consumer.ConsumerFetcherThread

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:

      Description

      1. Attached file system_test_1348521165.tar.gz contains all the associated log files for this test session.

      2. The system test output log can be found at: system_test_1348521165/system_test_output.log

      3. The following log message can be found at: system_test_1348521165/logs/console_consumer-5/console_consumer.log

      [2012-09-24 14:15:12,016] ERROR [ConsumerFetcherThread-console-consumer-16186_jfung-1348521311426-2c83ced7-0-1], Error due to (kafka.consumer.ConsumerFetcherThread)

      java.lang.IndexOutOfBoundsException

      at java.nio.Buffer.checkIndex(Buffer.java:512)

      at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)

      at kafka.message.Message.magic(Message.scala:119)

      at kafka.message.Message.checksum(Message.scala:132)

      at kafka.message.Message.isValid(Message.scala:144)

      at kafka.message.ByteBufferMessageSet$$anon$1.makeNextOuter(ByteBufferMessageSet.scala:118)

      at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:149)

      at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:89)

      at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:61)

      at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:53)

      at kafka.message.ByteBufferMessageSet.shallowValidBytes(ByteBufferMessageSet.scala:54)

      at kafka.message.ByteBufferMessageSet.validBytes(ByteBufferMessageSet.scala:49)

      at kafka.consumer.PartitionTopicInfo.enqueue(PartitionTopicInfo.scala:54)

      at kafka.consumer.ConsumerFetcherThread.processPartitionData(ConsumerFetcherThread.scala:42)

      at kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:103)

      at kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:96)

      at scala.collection.immutable.Map$Map2.foreach(Map.scala:127)

      at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:96)

      at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)

      1. system_test_1348521165.tar.gz
        109 kB
        John Fung
      2. kafka-528.patch
        11 kB
        Jun Rao
      3. kafka-528_v2.patch
        10 kB
        Jun Rao

        Activity

        John Fung created issue -
        John Fung made changes -
        Field Original Value New Value
        Attachment system_test_1348521165.tar.gz [ 12546390 ]
        Hide
        Jun Rao added a comment -

        Attach a patch. There are a couple of critical issues.

        1. FileMessageSet.truncateTo(): There is a java bug in FileChannel.truncate. The problem is that file size is set correctly, but file position is not changed accordingly. Added a fix that manually sets the file position.

        2. ZookeeperConsumerConnector: The call to get the initial offset sometimes fails since it's connected to a broker that's not the leader of a partition. Changed it to set the initial offset in ConsumerFetcherThread.

        Also fixed a few other issues:
        3. Partition: There is unnecessary ISR expansion.

        4. SyncProducer: lastConnectionTime should be set in connect().

        5. DumpLogSegment: Report if log is corrupted at the end of a file.

        6. Added some more tracing code.

        Show
        Jun Rao added a comment - Attach a patch. There are a couple of critical issues. 1. FileMessageSet.truncateTo(): There is a java bug in FileChannel.truncate. The problem is that file size is set correctly, but file position is not changed accordingly. Added a fix that manually sets the file position. 2. ZookeeperConsumerConnector: The call to get the initial offset sometimes fails since it's connected to a broker that's not the leader of a partition. Changed it to set the initial offset in ConsumerFetcherThread. Also fixed a few other issues: 3. Partition: There is unnecessary ISR expansion. 4. SyncProducer: lastConnectionTime should be set in connect(). 5. DumpLogSegment: Report if log is corrupted at the end of a file. 6. Added some more tracing code.
        Jun Rao made changes -
        Attachment kafka-528.patch [ 12546531 ]
        Jun Rao made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.8 [ 12317244 ]
        Labels bugs
        Fix Version/s 0.8 [ 12317244 ]
        Jun Rao made changes -
        Assignee Jun Rao [ junrao ]
        Jun Rao made changes -
        Priority Major [ 3 ] Blocker [ 1 ]
        Hide
        Jay Kreps added a comment -

        +1 Lgtm.

        You made the logging for leaderAndIsr request to be info level. If it is info level it would be good to rephrase that in a way folks would be able to understand, otherwise we have a log full of really esoteric things that only make sense to us.

        Show
        Jay Kreps added a comment - +1 Lgtm. You made the logging for leaderAndIsr request to be info level. If it is info level it would be good to rephrase that in a way folks would be able to understand, otherwise we have a log full of really esoteric things that only make sense to us.
        Hide
        John Fung added a comment - - edited

        1. Tested kafka-528.patch with rev. 1389460 and there is no data loss now. The Exception specified in this JIRA is not showing.

        2. However, there is still another exception showing in broker log (already filed under KAFKA-529) :

        [2012-09-28 23:05:54,669] ERROR [ReplicaFetcherThread-2-0-on-broker-3], Error due to (kafka.server.ReplicaFetcherThread)
        java.lang.IndexOutOfBoundsException
        at java.nio.Buffer.checkIndex(Buffer.java:514)
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
        at kafka.message.Message.magic(Message.scala:119)
        at kafka.message.Message.checksum(Message.scala:132)
        at kafka.message.Message.isValid(Message.scala:144)

        Show
        John Fung added a comment - - edited 1. Tested kafka-528.patch with rev. 1389460 and there is no data loss now. The Exception specified in this JIRA is not showing. 2. However, there is still another exception showing in broker log (already filed under KAFKA-529 ) : [2012-09-28 23:05:54,669] ERROR [ReplicaFetcherThread-2-0-on-broker-3] , Error due to (kafka.server.ReplicaFetcherThread) java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:514) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121) at kafka.message.Message.magic(Message.scala:119) at kafka.message.Message.checksum(Message.scala:132) at kafka.message.Message.isValid(Message.scala:144)
        Hide
        Jun Rao added a comment - - edited

        Attach patch v2. Fixed one additional bug in KafkaController.

        The bug is that Map(("a",0)>0,("a",1)>0)).map(_,_1) returns Map("a" -> 1) while we expect it to be Seq(("a,0),("a",1)). Fixed it by using .keySet instead.

        With patch v2, our system test now passes.

        Show
        Jun Rao added a comment - - edited Attach patch v2. Fixed one additional bug in KafkaController. The bug is that Map(("a",0) >0,("a",1) >0)).map(_,_1) returns Map("a" -> 1) while we expect it to be Seq(("a,0),("a",1)). Fixed it by using .keySet instead. With patch v2, our system test now passes.
        Jun Rao made changes -
        Attachment kafka-528_v2.patch [ 12547088 ]
        Hide
        Neha Narkhede added a comment -

        +1 on patch v2.

        Minor comments on logging -

        1. The debug statement in Replica says "Updated highwatermark to ". It will be good to know the topic and partition as well.
        2. The println in DumpLogSegment says "Log is corrupted from... Is corrupted". Let's fix that.

        Show
        Neha Narkhede added a comment - +1 on patch v2. Minor comments on logging - 1. The debug statement in Replica says "Updated highwatermark to ". It will be good to know the topic and partition as well. 2. The println in DumpLogSegment says "Log is corrupted from... Is corrupted". Let's fix that.
        Hide
        Jun Rao added a comment -

        Thanks for the review. Committed to 0.8 with some minor changes.

        Show
        Jun Rao added a comment - Thanks for the review. Committed to 0.8 with some minor changes.
        Jun Rao made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Jun Rao made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Open Open Patch Available Patch Available
        17h 53m 1 Jun Rao 25/Sep/12 16:42
        Patch Available Patch Available Resolved Resolved
        4d 2h 36m 1 Jun Rao 29/Sep/12 19:19
        Resolved Resolved Closed Closed
        4s 1 Jun Rao 29/Sep/12 19:19

          People

          • Assignee:
            Jun Rao
            Reporter:
            John Fung
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development