Kafka
KAFKA-1510

Force offset commits when migrating consumer offsets from zookeeper to kafka

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.2.0
    • Fix Version/s: 0.8.2.0
    • Component/s: None
    • Labels:

      Description

      When migrating consumer offsets from ZooKeeper to Kafka, we have to turn on dual-commit (i.e., the consumers will commit offsets to both ZooKeeper and Kafka) in addition to setting offsets.storage to kafka. However, we only commit offsets if they have changed since the last commit. For low-volume topics, or for topics that receive data in bursts, offsets may not move for a long period of time. Therefore we may want to force the commit (even if offsets have not changed) when migrating, i.e., when dual-commit is enabled: we can add a minimum interval threshold (say, force a commit after every 10 auto-commits), as well as forcing a commit on rebalance and shutdown.

      Also, I think it is safe to switch the default for offsets.storage from zookeeper to kafka and set the default to dual-commit (for people who have not migrated yet). We have deployed this to the largest consumers at LinkedIn and have not seen any issues so far (except for the migration caveat that this JIRA will resolve).
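For reference, the migration phase described above corresponds roughly to the following old (high-level) consumer properties. This is a sketch: offsets.storage and dual.commit.enabled are the 0.8.x consumer configs, while the auto-commit interval value is purely illustrative.

```properties
# Read offsets from Kafka, but keep writing to both stores during migration
offsets.storage=kafka
dual.commit.enabled=true

# Auto-commit settings (interval value is illustrative)
auto.commit.enable=true
auto.commit.interval.ms=60000
```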

        Activity

        Joel Koshy made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        nicu marasoiu made changes -
        nicu marasoiu made changes -
        Attachment Unfiltered_to_kafka,_Incremental_to_Zookeeper.patch [ 12665378 ]
        nicu marasoiu made changes -
        Attachment Unfiltered_to_kafka,_Incremental_to_Zookeeper.patch [ 12665378 ]
        Joel Koshy made changes -
        Reviewer Joel Koshy [ jjkoshy ]
        nicu marasoiu made changes -
        Comment [ The patch makes the simplest choices:
        1. unfiltered commits when storage=kafka (unfiltered to both storages, if applicable).
        2. unfiltered retries (even if some of the offsets have already been successfully sent in previous attempts)

        A more general way to solve point 2 would be to account for freshly committed offsets, not just offset value changes, when deciding whether to filter an offset. ]
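The filtering decision the comment describes can be sketched as follows. This is a hypothetical illustration, not the patch itself: when dual-commit is enabled every tracked offset is committed unconditionally, otherwise offsets that have not moved since the last commit are skipped. The class and method names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the commit-filtering choice described above.
// Keys stand in for topic-partitions; values are offsets.
class OffsetCommitFilter {
    private final Map<String, Long> lastCommitted = new HashMap<>();

    // Returns the subset of offsets to commit. With dualCommit enabled,
    // everything is committed unfiltered; otherwise unchanged offsets
    // (equal to the last committed value) are skipped.
    Map<String, Long> toCommit(Map<String, Long> current, boolean dualCommit) {
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            Long prev = lastCommitted.get(e.getKey());
            if (dualCommit || prev == null || !prev.equals(e.getValue())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        lastCommitted.putAll(result);
        return result;
    }
}
```

Note how this makes the trade-off visible: with filtering, an offset that never moves is never re-committed, which is exactly what stalls the migration for low-volume topics.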
        nicu marasoiu made changes -
        Attachment kafka-1510.patch [ 12659527 ]
        nicu marasoiu made changes -
        Attachment kafka-1510.patch [ 12659527 ]
        nicu marasoiu made changes -
        Attachment forceCommitOnShutdownWhenDualCommit.patch [ 12658005 ]
        nicu marasoiu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Assignee nicu marasoiu [ nmarasoi ] Joel Koshy [ jjkoshy ]
        nicu marasoiu made changes -
        Comment [ Attached the patch with the semantics detailed in my previous comment. ]
        nicu marasoiu made changes -
        Comment [ Forcing all offsets to ZooKeeper too does indeed have the drawback that it will typically copy the same offsets again, and not only once but potentially several times (if the Kafka commit is retried).

        However, the alternative is to commit to both Kafka and ZooKeeper unconditionally in the normal flow (right now, the commit to ZooKeeper happens only after a successful commit to Kafka, if any). That too poses the same risk of committing multiple times to one system (ZooKeeper) if the other (Kafka) needs retries. So a clean way here would be completely separate OffsetDAO implementations: one for Kafka, one for ZooKeeper, and one for dual mode that reads, as now, max(both), while writes go to the two implementations, each of them doing retries without affecting the other!
        ]
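The split-DAO idea in the comment above can be sketched as follows. This is a hypothetical illustration (OffsetDao, InMemoryOffsetDao, and DualOffsetDao are invented names, and the in-memory backend stands in for real Kafka/ZooKeeper stores): the dual implementation writes to both backends, each of which could retry independently, and reads the max of the two values on fetch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal DAO interface for offset storage.
interface OffsetDao {
    void commit(String topicPartition, long offset);
    Long fetch(String topicPartition);  // null if no offset stored
}

// Stand-in backend; a real one would talk to Kafka or ZooKeeper
// and handle its own retries.
class InMemoryOffsetDao implements OffsetDao {
    private final Map<String, Long> store = new HashMap<>();
    public void commit(String tp, long offset) { store.put(tp, offset); }
    public Long fetch(String tp) { return store.get(tp); }
}

// Dual mode: write to both backends independently, read max(both).
class DualOffsetDao implements OffsetDao {
    private final OffsetDao kafka;
    private final OffsetDao zookeeper;

    DualOffsetDao(OffsetDao kafka, OffsetDao zookeeper) {
        this.kafka = kafka;
        this.zookeeper = zookeeper;
    }

    public void commit(String tp, long offset) {
        kafka.commit(tp, offset);      // each backend retries on its own
        zookeeper.commit(tp, offset);  // without affecting the other
    }

    public Long fetch(String tp) {
        Long k = kafka.fetch(tp);
        Long z = zookeeper.fetch(tp);
        if (k == null) return z;
        if (z == null) return k;
        return Math.max(k, z);         // read max(both), as suggested
    }
}
```

The point of the design is isolation: a retry loop in one backend never forces a duplicate commit to the other, which is the risk the comment identifies with the single interleaved flow.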
        nicu marasoiu made changes -
        Attachment forceCommitOnShutdownWhenDualCommit.patch [ 12658005 ]
        Joel Koshy made changes -
        Summary Force offset commits at a minimum interval when migrating consumer offsets from zookeeper to kafka Force offset commits when migrating consumer offsets from zookeeper to kafka
        nicu marasoiu made changes -
        Assignee nicu marasoiu [ nmarasoi ]
        Guozhang Wang made changes -
        Field Original Value New Value
        Labels newbie
        Joel Koshy created issue -

          People

          • Assignee:
            Joel Koshy
            Reporter:
            Joel Koshy
            Reviewer:
            Joel Koshy
          • Votes:
            0
            Watchers:
            5
