Kafka / KAFKA-197

Embedded consumer doesn't shut down if the server can't start

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: 0.7.1
    • Component/s: None
    • Labels: None

      Description

      If a broker embeds a consumer and the broker itself fails to start (e.g., because of a conflicting broker id in ZK), the embedded consumer keeps running. In this case, we should probably shut down the embedded consumer too.

      To do this, we need to either throw an exception or return an error in KafkaServer.startup and act accordingly in KafkaServerStartable.startup.
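The exception-based option described above could look roughly like the following sketch. It is not the committed patch: `Service` and `StartableSketch` are hypothetical stand-ins for KafkaServer, the embedded consumer, and KafkaServerStartable, and the order in which the two are started and stopped is an assumption.

```java
// Hypothetical stand-in for anything with a startup/shutdown lifecycle
// (the broker and the embedded consumer in this issue).
interface Service {
    void startup() throws Exception;
    void shutdown();
}

// Hypothetical stand-in for KafkaServerStartable; not the real class.
class StartableSketch {
    private final Service server;
    private final Service embeddedConsumer;

    StartableSketch(Service server, Service embeddedConsumer) {
        this.server = server;
        this.embeddedConsumer = embeddedConsumer;
    }

    /** Returns true if both parts started; otherwise shuts both down and returns false. */
    boolean startup() {
        try {
            server.startup();            // may throw, e.g. on a conflicting broker id
            embeddedConsumer.startup();
            return true;
        } catch (Exception e) {
            shutdown();                  // all error handling lives in one place
            return false;
        }
    }

    void shutdown() {
        // Assumed order: stop the consumer first, then the server,
        // and stop the server even if the consumer shutdown throws.
        try {
            embeddedConsumer.shutdown();
        } finally {
            server.shutdown();
        }
    }
}
```

With this shape, a failed server startup never leaves the embedded consumer running on its own, which is the behavior the description asks for.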

      1. KAFKA-197.patch
        0.4 kB
        Jun Rao
      2. KAFKA-197_v2.patch
        6 kB
        Jun Rao

        Activity

        Jun Rao added a comment -

        Just committed this.

        Neha Narkhede added a comment -

        +1 on v2

        Jun Rao added a comment -

        Attached patch v2, which consolidates all error handling in KafkaServerStartable.

        Jun Rao added a comment -

        The main reason that we moved away from writing to the local log is to pick up the compression support in the high-level producer. Decoupling the embedded consumer from the broker may not be a bad idea. It does mean one more service/process to manage. However, it's probably more flexible (to support things like consuming from multiple sources and plugging in logic for consumer-side auditing) and is less intrusive to the core Kafka code.

        Jay Kreps added a comment -

        A less invasive way would just be to have the embedded consumer register a shutdown hook and use System.exit.

        I am a little concerned about this whole embedded consumer thing, though. The original approach, where we wrote to the local log in process, was pretty foolproof. I think sending to a remote broker is actually riddled with issues. The producer send buffer is vulnerable to quite a large loss on any unclean shutdown, or indeed on any shutdown bug. Also, any condition that leaves a broker unable to take requests but still registered in ZK will lead to unbounded data loss. I wonder if this issue isn't just a special case of the many bad things that could happen.

        With the current approach I actually don't see any benefit at all to bundling the replication process with the Kafka broker. It seems to me it would be better to have that run independently.
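The shutdown-hook suggestion at the top of this comment could be sketched as follows. `ShutdownHookSketch` and its nested `Consumer` interface are hypothetical names used only for illustration; `Runtime.addShutdownHook` is the standard JDK call.

```java
// Hypothetical illustration of the shutdown-hook idea; not Kafka code.
class ShutdownHookSketch {

    // Hypothetical stand-in for the embedded consumer.
    interface Consumer {
        void shutdown();
    }

    /** Registers a JVM shutdown hook that stops the consumer and returns the hook thread. */
    static Thread registerShutdownHook(Consumer consumer) {
        Thread hook = new Thread(consumer::shutdown, "embedded-consumer-shutdown");
        // Hooks run on normal exit and on System.exit, but not on Runtime.halt.
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Under this approach, the code path that detects a failed broker startup would only need to call `System.exit` with a non-zero status, and the registered hook would stop the consumer on the way out.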

        Neha Narkhede added a comment -

        Would it be reasonable to have KafkaServerStartable register a callback with KafkaServer, and have the shutdown API of KafkaServer invoke that callback? That way, we can ensure that KafkaServerStartable can cleanly shut down the embedded consumer when the server is shut down for some reason.
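A minimal sketch of the callback idea, assuming a hypothetical `registerShutdownCallback` method that does not exist on the real KafkaServer:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for KafkaServer; the callback API here is an assumption.
class CallbackServerSketch {
    private final List<Runnable> shutdownCallbacks = new ArrayList<>();
    private boolean shutDown = false;

    /** Lets the owner (e.g. a startable wrapper) ask to be notified on shutdown. */
    synchronized void registerShutdownCallback(Runnable callback) {
        shutdownCallbacks.add(callback);
    }

    /** Shuts the server down once and then notifies every registered callback. */
    synchronized void shutdown() {
        if (shutDown) {
            return;                      // repeated calls are no-ops
        }
        shutDown = true;
        // Real server cleanup (flushing logs, closing sockets) would go here.
        for (Runnable callback : shutdownCallbacks) {
            callback.run();
        }
    }
}
```

The wrapper would register something like `embeddedConsumer::shutdown` as the callback, so that any path that shuts the server down also stops the consumer.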

        Jay Kreps added a comment -

        I don't think we want to call halt(); that is like running kill -9 on the process. I think we want the logs to flush and the process to shut down gracefully. Can't we just do a graceful shutdown on both the server and the embedded consumer?

        Jun Rao added a comment -

        Patch attached.


          People

          • Assignee: Jun Rao
          • Reporter: Jun Rao
