KAFKA-804: Incorrect index in the log of a follower

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: core
    • Labels: None

      Description

      In the follower, in log.append(), we use the offset of the 1st compressed message as the offset for appending index entries. This means that the index is pointing to an earlier position in the file than it should. Instead, it should use the offset of the 1st uncompressed message for appending index.
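To make the two candidate offsets concrete, here is a minimal sketch of appending a compressed message set. The function name and model are hypothetical, not Kafka's actual Log.append code; it assumes the wrapper message carries the offset of the last inner message.

```python
# Hypothetical model, not Kafka's actual Log.append. A compressed set of
# n inner messages occupies the offset range
# [next_offset, next_offset + n - 1]; the wrapper message carries the
# offset of the LAST inner message.

def append_compressed_set(next_offset, inner_count):
    """Return (index_entry_offset, wrapper_offset, new_next_offset)."""
    first_inner = next_offset                # offset of 1st uncompressed message
    wrapper = next_offset + inner_count - 1  # offset of last inner message
    # The fix in this issue: index the set under the nextOffset captured
    # BEFORE the append, i.e. the first uncompressed message's offset.
    return first_inner, wrapper, wrapper + 1

index_off, wrapper_off, next_off = append_compressed_set(100, 3)
print(index_off, wrapper_off, next_off)  # 100 102 103
```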

        Activity

        Jun Rao added a comment -

        Attach a patch. It always uses the nextOffset before append as the 1st offset.

        Neha Narkhede added a comment -

        +1

        Jay Kreps added a comment -

        I am not sure about this change. There are two choices for the index entry: the offset of the wrapper message, which is the offset of the last message in the compressed set, or the offset of the first message in the compressed set. Either of these is correct, since an index entry only needs to be a lower bound on the position, so this is not a bug. I do agree that we should ideally produce the same index entries on the leader and the follower, but I am not sure whether we should choose the first or the last message. The problem with choosing the first message is that the integrity of the index can then only be validated using deep iteration on the log, and the index points to a message which doesn't carry the offset given in the index entry, which is very odd.

        This code looks very confusing to me. I can't tell whether this change is making it worse or the code was already just very confusing. I recommend we leave it as is on 0.8 and fix it on trunk, where that code has been cleaned up. Let's also figure out the pros and cons of which offset the index should contain.
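The lower-bound argument above can be sketched as follows; the index model and lookup function here are illustrative, not Kafka's actual OffsetIndex implementation:

```python
# Sketch of why either offset choice yields a correct index: lookup
# returns the largest indexed offset <= the target, and the reader scans
# forward from that file position, so any entry that is a lower bound on
# the true position is safe. Entries are (offset, file_position) pairs,
# sorted by offset.
import bisect

def lookup(index, target_offset):
    offsets = [off for off, _ in index]
    i = bisect.bisect_right(offsets, target_offset) - 1
    return index[i][1] if i >= 0 else 0  # file position to start scanning from

index = [(0, 0), (100, 4096), (200, 8192)]
# Whether a compressed set spanning offsets 100-102 was indexed under 100
# (first inner message) or 102 (wrapper), a search for offset 150 starts
# at position 4096 and scans forward -- never past the target.
print(lookup(index, 150))  # 4096
```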

        Jun Rao added a comment -

        Yes, the log code is a bit more complicated now, to address a few recent issues that we found:
        KAFKA-802: flush interval based on compressed message set
        KAFKA-765: corrupted messages cause broker to shut down
        KAFKA-767: message size check needs to be done after offset assignment

        My main concern with not addressing this in 0.8 is that it makes index verification hard in our system tests.

        Indexing the first message seems to make the most intuitive sense. Yes, we do need to patch DumpLogSegments to support deep message iteration, in order to do index verification.
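The difference between shallow and deep iteration for verification purposes can be sketched like this (a hypothetical model, not the real DumpLogSegments):

```python
# Hypothetical sketch of shallow vs. deep iteration for index
# verification, not the real DumpLogSegments tool. Shallow iteration sees
# only wrapper offsets; deep iteration decompresses each set and yields
# every inner message offset, so an index entry keyed on a first inner
# offset can be matched against an actual message.

def shallow_offsets(segment):
    # Each compressed set is (last_offset, inner_count).
    return [last for last, count in segment]

def deep_offsets(segment):
    return [off
            for last, count in segment
            for off in range(last - count + 1, last + 1)]

segment = [(2, 3), (5, 3)]  # two compressed sets: offsets 0-2 and 3-5
print(shallow_offsets(segment))  # [2, 5]
print(deep_offsets(segment))     # [0, 1, 2, 3, 4, 5]
```

An index entry keyed on offset 3 (the first inner message of the second set) is visible only through deep iteration, which is why the tool needs patching.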

        Jay Kreps added a comment -

        So does this patch have the fix for all those?

        Jay Kreps added a comment - edited
        • I only see stats being updated in the case where assignOffsets is true.
        • Please fix the logging to be human readable.

        Can you give a patch against trunk too?

        Jay Kreps added a comment -

        Also:

        • the offsets tuple gets recreated, which isn't needed
        • the count being done in analyzeAndValidateMessageSet is wrong and not being used; it should be removed (this can happen on trunk).
        Jun Rao added a comment -

        Thanks for the review. Committed to 0.8 by addressing the following issues.

        1. The stat for message rate is actually meant for monitoring data rates from producers. So it's correct to only account for it in the leader during offset assignment.

        2. Fixed the log.

        3. Fixed the issue with unnecessarily recreating the tuple.

        4. Didn't fix count in analyzeAndValidateMessageSet in this patch. Will fix it when merging to trunk. Will create a separate JIRA for review.


          People

          • Assignee: Jun Rao
          • Reporter: Jun Rao
          • Votes: 0
          • Watchers: 3