Kafka
  1. Kafka
  2. KAFKA-596

LogSegment.firstAppendTime not reset after truncate to

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: core
    • Labels:

      Description

      Currently, we don't reset LogSegment.firstAppendTime after the segment is truncated. What can happen is that we truncate the segment to size 0 and on next append, a new log segment with the same starting offset is rolled because the time-based rolling is triggered.

      1. kafka-596.patch
        2 kB
        Swapnil Ghike
      2. kafka-596-v2.patch
        2 kB
        Swapnil Ghike
      3. kafka-596-v3.patch
        0.9 kB
        Swapnil Ghike

        Activity

        Hide
        Jun Rao added a comment -

        Thanks for patch v3. +1. Committed to 0.8.

        Show
        Jun Rao added a comment - Thanks for patch v3. +1. Committed to 0.8.
        Hide
        Neha Narkhede added a comment -

        Good catch, Jun !

        +1 on v3

        Show
        Neha Narkhede added a comment - Good catch, Jun ! +1 on v3
        Hide
        Swapnil Ghike added a comment -

        After a discussion with Jun, reverted the conditions in maybeRoll() to the trunk version.

        Setting firstAppendTime to None in LogSegment.truncateTo() when messageSet size becomes 0 is enough to make sure that Log.maybeRoll() will not roll a new segment at the same starting offset as the last segment.

        Show
        Swapnil Ghike added a comment - After a discussion with Jun, reverted the conditions in maybeRoll() to the trunk version. Setting firstAppendTime to None in LogSegment.truncateTo() when messageSet size becomes 0 is enough to make sure that Log.maybeRoll() will not roll a new segment at the same starting offset as the last segment.
        Hide
        Swapnil Ghike added a comment -

        Yes, I see your point. Attached a new patch.

        Show
        Swapnil Ghike added a comment - Yes, I see your point. Attached a new patch.
        Hide
        Jun Rao added a comment -

        I agree that setting firstAppendTime to None in Logsement.truncateTo() is necessary. However, I don't think this is necessary in Log.maybeRoll() and Log.markedDeletedWhile(). In both cases, we are not changing the log segment. So whoever changed the segment last to make its size 0 (either through truncation or creation) would have set firstAppendTime properly. Note that maybeRoll is called on every log append. We don't want to add unnecessary overhead.

        Show
        Jun Rao added a comment - I agree that setting firstAppendTime to None in Logsement.truncateTo() is necessary. However, I don't think this is necessary in Log.maybeRoll() and Log.markedDeletedWhile(). In both cases, we are not changing the log segment. So whoever changed the segment last to make its size 0 (either through truncation or creation) would have set firstAppendTime properly. Note that maybeRoll is called on every log append. We don't want to add unnecessary overhead.
        Hide
        Swapnil Ghike added a comment -

        Actually the modification in maybeRoll() in the conditions that determine whether to roll a new segment or not, is enough for correctly fixing the issue mentioned above. The fix rolls a new segment only if the size of messageSet is > 0. So if we truncated the segment to size 0, maybeRoll() will not roll a new segment at the same starting offset.

        I kept those lines in Log.maybeRoll(), Logsement.truncateTo() and Log.markedDeletedWhile() for optimization. Setting the firstAppendTime to None whenever the size is found to be 0 will postpone the next time based roll and also will not harm correctness.

        Show
        Swapnil Ghike added a comment - Actually the modification in maybeRoll() in the conditions that determine whether to roll a new segment or not, is enough for correctly fixing the issue mentioned above. The fix rolls a new segment only if the size of messageSet is > 0. So if we truncated the segment to size 0, maybeRoll() will not roll a new segment at the same starting offset. I kept those lines in Log.maybeRoll(), Logsement.truncateTo() and Log.markedDeletedWhile() for optimization. Setting the firstAppendTime to None whenever the size is found to be 0 will postpone the next time based roll and also will not harm correctness.
        Hide
        Jun Rao added a comment -

        Thanks for the patch. A couple of comments:

        1. Log.maybeRoll(): Are the following lines needed since we are not creating a new segment?
        if (segment.messageSet.sizeInBytes == 0)
        segment.firstAppendTime = None

        2. Log.markedDeletedWhile(): Is the following line needed since we are not creating a new segment?
        view(numToDelete - 1).firstAppendTime = None

        Show
        Jun Rao added a comment - Thanks for the patch. A couple of comments: 1. Log.maybeRoll(): Are the following lines needed since we are not creating a new segment? if (segment.messageSet.sizeInBytes == 0) segment.firstAppendTime = None 2. Log.markedDeletedWhile(): Is the following line needed since we are not creating a new segment? view(numToDelete - 1).firstAppendTime = None
        Hide
        Swapnil Ghike added a comment -

        Yes, that's a pretty sharp observation. The current time based roll policy only checks whether the segment was not appended for its lifetime. This patch has three fixes:

        1. Initial assignment of firstAppendTime in Logsegment, because the non-primary constructor could potentially initialize the messageSet and set its size > 0. (I don't know if this fix will affect any other jiras.)

        2. In maybeRoll(), a new condition makes sure that roll() happens based on time only if the messageset size > 0, thus different segments cannot have identical starting offsets. It also makes sure that a new segment is not rolled if the last segment is not appended with messages until now.

        2. A segment is reborn at three places by setting its firstAppendTime to None if the message set size is 0 -
        i. Log.maybeRoll()
        ii. Logsement.truncateTo()
        iii. Log.markedDeletedWhile()

        Show
        Swapnil Ghike added a comment - Yes, that's a pretty sharp observation. The current time based roll policy only checks whether the segment was not appended for its lifetime. This patch has three fixes: 1. Initial assignment of firstAppendTime in Logsegment, because the non-primary constructor could potentially initialize the messageSet and set its size > 0. (I don't know if this fix will affect any other jiras.) 2. In maybeRoll(), a new condition makes sure that roll() happens based on time only if the messageset size > 0, thus different segments cannot have identical starting offsets. It also makes sure that a new segment is not rolled if the last segment is not appended with messages until now. 2. A segment is reborn at three places by setting its firstAppendTime to None if the message set size is 0 - i. Log.maybeRoll() ii. Logsement.truncateTo() iii. Log.markedDeletedWhile()
        Hide
        Jun Rao added a comment -

        The fix is to set LogSegment.firstAppendTime to none if we truncate the segment to size 0. However, this brings up the deeper question of how do we prevent segments with identical starting offset from being created during log roll? Maybe, we should add a check in log.roll to guard this.

        Show
        Jun Rao added a comment - The fix is to set LogSegment.firstAppendTime to none if we truncate the segment to size 0. However, this brings up the deeper question of how do we prevent segments with identical starting offset from being created during log roll? Maybe, we should add a check in log.roll to guard this.

          People

          • Assignee:
            Swapnil Ghike
            Reporter:
            Jun Rao
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 24h
              24h
              Remaining:
              Remaining Estimate - 24h
              24h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development