Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-4051

Strange behavior during rebalance when turning the OS clock back

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 0.10.0.0
    • 0.10.1.0
    • consumer
    • None
    • OS: Ubuntu 14.04 - 64bits

    Description

      If a rebalance is performed after turning the OS clock back, then the kafka server enters in a loop and the rebalance cannot be completed until the system returns to the previous date/hour.

      Steps to Reproduce:

      • Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will be owner of all the partitions.
      • Turn the system (OS) clock back. For instance 1 hour.
      • Start a new consumer for TOPIC_NAME using the same group id, it will force a rebalance.

      After these actions the kafka server logs constantly display the messages below, and after a while both consumers do not receive more packages. This condition lasts at least the time that the clock went back, for this example 1 hour, and finally after this time kafka comes back to work.

      [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
      [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)

      Due to the fact that some systems could have enabled NTP or an administrator option to change the system clock (date/time) it's important to do it safely, currently the only way to do it safely is following the next steps:

      1- Tear down the Kafka server.
      2- Change the date/time
      3- Tear up the Kafka server.

      But, this approach can be done only if the change was performed by the administrator, not for NTP. Also in many systems turning down the Kafka server might cause the INFORMATION TO BE LOST.

      Attachments

        Issue Links

          Activity

            ijuma Ismael Juma added a comment -

            As discussed in the thread, Kafka uses System.currentTimeMillis() in a number of places, which means that changing the system clock backwards is bound to cause issues. I wouldn't be surprised if there are other problems in addition to the one mentioned here.

            ijuma Ismael Juma added a comment - As discussed in the thread, Kafka uses System.currentTimeMillis() in a number of places, which means that changing the system clock backwards is bound to cause issues. I wouldn't be surprised if there are other problems in addition to the one mentioned here.
            rsivaram Rajini Sivaram added a comment -

            I think the issue is around the handling of timer tasks in Kafka. While task expiry is set using System.currentTimeMillis which can move backwards (as reported in this JIRA), the internal timers in the TimingWheel in Kafka used to handle expiry is a monotonically increasing timer that starts with System.currentTimeMillis. This mismatch causes expiry of tasks until System.currentTimeMillis catches up with the internal timer.

            As ijuma has pointed out on the mailing list, Kafka uses System.currentTimeMillis in a lot of places and switching to System.nanoTime everywhere could impact performance. We have a few choices on fixing this JIRA (in increasing order of complexity)

            1. We could switch over to System.nanoTime for TimerTasks alone to fix the issue with delayed tasks reported here
            2. It may be possible to change the timer implementation to recover better when wall clock time moves backwards
            3. Replace System.currentTimeMillis with System.nanoTime in time comparisons throughout Kafka code

            I am inclined to do 1) and run performance tests, but am interested in what others think.

            rsivaram Rajini Sivaram added a comment - I think the issue is around the handling of timer tasks in Kafka. While task expiry is set using System.currentTimeMillis which can move backwards (as reported in this JIRA), the internal timers in the TimingWheel in Kafka used to handle expiry is a monotonically increasing timer that starts with System.currentTimeMillis . This mismatch causes expiry of tasks until System.currentTimeMillis catches up with the internal timer. As ijuma has pointed out on the mailing list, Kafka uses System.currentTimeMillis in a lot of places and switching to System.nanoTime everywhere could impact performance. We have a few choices on fixing this JIRA (in increasing order of complexity) We could switch over to System.nanoTime for TimerTasks alone to fix the issue with delayed tasks reported here It may be possible to change the timer implementation to recover better when wall clock time moves backwards Replace System.currentTimeMillis with System.nanoTime in time comparisons throughout Kafka code I am inclined to do 1) and run performance tests, but am interested in what others think.
            rsivaram Rajini Sivaram added a comment -

            ijuma As you have pointed out, there are inevitably going to be issues since Kafka uses System.currentTimeMillis in so many places. But typically, you would expect a single clock change to cause one expiry and thereafter continue to work with the changed timer (eg. producer metadata get expires and retry works since it is on the updated clock). The issue in this JIRA is that the broker doesn't recover until the wall clock time reaches the previously set time. I imagine changing the clock back by an hour is an uncommon scenario, but the impact is quite big if it does happen. If we are fixing this issue, it will be useful to have a system test to check that Kafka continues to function after a major clock change.

            rsivaram Rajini Sivaram added a comment - ijuma As you have pointed out, there are inevitably going to be issues since Kafka uses System.currentTimeMillis in so many places. But typically, you would expect a single clock change to cause one expiry and thereafter continue to work with the changed timer (eg. producer metadata get expires and retry works since it is on the updated clock). The issue in this JIRA is that the broker doesn't recover until the wall clock time reaches the previously set time. I imagine changing the clock back by an hour is an uncommon scenario, but the impact is quite big if it does happen. If we are fixing this issue, it will be useful to have a system test to check that Kafka continues to function after a major clock change.
            ijuma Ismael Juma added a comment -

            rsivaram, option 1 seems worth a try. Note that it would have to be tested on Linux and Mac to verify that it fixes the underlying issue (it would be good to test on Windows too, but we have known issues in that platform so should not be a blocker, I guess). The JDK implementation relies on the OS (and with some JDK-level workarounds sometimes), so it's platform dependent.

            gwenshap had some concerns on the usage of System.nanoTime with regards to the impact on loaded systems so it would be good to get her thoughts too.

            ijuma Ismael Juma added a comment - rsivaram , option 1 seems worth a try. Note that it would have to be tested on Linux and Mac to verify that it fixes the underlying issue (it would be good to test on Windows too, but we have known issues in that platform so should not be a blocker, I guess). The JDK implementation relies on the OS (and with some JDK-level workarounds sometimes), so it's platform dependent. gwenshap had some concerns on the usage of System.nanoTime with regards to the impact on loaded systems so it would be good to get her thoughts too.
            gwenshap Gwen Shapira added a comment -

            I was definitely concerned about complete replacement throughout the code. Especially when there was no justification in most places.

            I think that going with option 1, doing performance comparisons and documenting exactly why we use nanotime there will be good.

            gwenshap Gwen Shapira added a comment - I was definitely concerned about complete replacement throughout the code. Especially when there was no justification in most places. I think that going with option 1, doing performance comparisons and documenting exactly why we use nanotime there will be good.
            rsivaram Rajini Sivaram added a comment -

            ijuma gwenshap Thank you both for the feedback. I will try it out locally, and run performance tests. I can test on Linux and Mac.

            rsivaram Rajini Sivaram added a comment - ijuma gwenshap Thank you both for the feedback. I will try it out locally, and run performance tests. I can test on Linux and Mac.
            onurkaraman Onur Karaman added a comment -

            Is there a legitimate reason for rolling back your clock, especially by an hour? Unless there's a legitimate reason, I don't see the value in investigating a solution further.

            Based on some quick reading, when the delta in times grows large (it was hard for me to tell if the threshold is 128 ms or 128 s, the docs mention both), ntpd may either step the time back or do slew-based correction depending on how you configure it.

            warning: I know next to nothing about ntp!

            reference: http://doc.ntp.org/4.2.4/ntpd.html#op

            onurkaraman Onur Karaman added a comment - Is there a legitimate reason for rolling back your clock, especially by an hour? Unless there's a legitimate reason, I don't see the value in investigating a solution further. Based on some quick reading, when the delta in times grows large (it was hard for me to tell if the threshold is 128 ms or 128 s, the docs mention both), ntpd may either step the time back or do slew-based correction depending on how you configure it. warning: I know next to nothing about ntp! reference: http://doc.ntp.org/4.2.4/ntpd.html#op
            githubbot ASF GitHub Bot added a comment -

            GitHub user rajinisivaram opened a pull request:

            https://github.com/apache/kafka/pull/1768

            KAFKA-4051: Use nanosecond clock for timers in broker

            Use System.nanoseconds instead of System.currentTimeMillis in broker timer tasks to cope with changes to wall-clock time.

            You can merge this pull request into a Git repository by running:

            $ git pull https://github.com/rajinisivaram/kafka KAFKA-4051

            Alternatively you can review and apply these changes as the patch at:

            https://github.com/apache/kafka/pull/1768.patch

            To close this pull request, make a commit to your master/trunk branch
            with (at least) the following in the commit message:

            This closes #1768


            commit 366659f9a6a0212030349bc54e0c798e3d6c2122
            Author: Rajini Sivaram <rajinisivaram@googlemail.com>
            Date: 2016-08-19T10:13:03Z

            KAFKA-4051: Use nanosecond clock for timers in broker


            githubbot ASF GitHub Bot added a comment - GitHub user rajinisivaram opened a pull request: https://github.com/apache/kafka/pull/1768 KAFKA-4051 : Use nanosecond clock for timers in broker Use System.nanoseconds instead of System.currentTimeMillis in broker timer tasks to cope with changes to wall-clock time. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rajinisivaram/kafka KAFKA-4051 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/1768.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1768 commit 366659f9a6a0212030349bc54e0c798e3d6c2122 Author: Rajini Sivaram <rajinisivaram@googlemail.com> Date: 2016-08-19T10:13:03Z KAFKA-4051 : Use nanosecond clock for timers in broker
            rsivaram Rajini Sivaram added a comment -

            Agree with onurkaraman that this is an unusual scenario. But the fix is a small change that can be seen as one small step towards improving Kafka's resilience to wall-clock time changes. The PR avoids the need for a broker restart in this case, but perhaps ibarra can provide more context to the problem scenario.

            rsivaram Rajini Sivaram added a comment - Agree with onurkaraman that this is an unusual scenario. But the fix is a small change that can be seen as one small step towards improving Kafka's resilience to wall-clock time changes. The PR avoids the need for a broker restart in this case, but perhaps ibarra can provide more context to the problem scenario.
            gabriel.ibarra Gabriel Ibarra added a comment -

            Hi, I'm sorry for the delay.
            I agree with you guys, it is not a typical scenario, it is even difficult for me to think about the reasons for a System Administrator to change the data/time; but he can, and as Rajini Sivaram said the impact is quite big if it does happen.

            Added to that, our system is working in a VM with NTP, and we see some spurious changes on the system time (this way we detected this issue), we are now analyzing why the time is changing, but we suspect that the system start with the hardware time and then NTP synchronize the system clock using the time zone configured in the VM.

            It is a good news that it could be fixed with small changes. Great Job Rajini

            gabriel.ibarra Gabriel Ibarra added a comment - Hi, I'm sorry for the delay. I agree with you guys, it is not a typical scenario, it is even difficult for me to think about the reasons for a System Administrator to change the data/time; but he can, and as Rajini Sivaram said the impact is quite big if it does happen. Added to that, our system is working in a VM with NTP, and we see some spurious changes on the system time (this way we detected this issue), we are now analyzing why the time is changing, but we suspect that the system start with the hardware time and then NTP synchronize the system clock using the time zone configured in the VM. It is a good news that it could be fixed with small changes. Great Job Rajini
            gwenshap Gwen Shapira added a comment -

            Issue resolved by pull request 1768
            https://github.com/apache/kafka/pull/1768

            gwenshap Gwen Shapira added a comment - Issue resolved by pull request 1768 https://github.com/apache/kafka/pull/1768
            githubbot ASF GitHub Bot added a comment -

            Github user asfgit closed the pull request at:

            https://github.com/apache/kafka/pull/1768

            githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/1768

            People

              rsivaram Rajini Sivaram
              gabriel.ibarra Gabriel Ibarra
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: