[KAFKA-1016] Broker should limit purgatory size


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: purgatory
    • Labels: None

    Description

      I recently ran into a case where a poorly configured Kafka consumer was able to trigger out-of-memory exceptions in multiple Kafka brokers. The consumer was configured with a fetcher.max.wait of Int.MaxInt.

      For low volume topics, this configuration causes the consumer to block frequently, and for long periods of time. junrao informs me that the fetch request will time out once the socket timeout is reached; in our case, this was set to 30s.
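
      For illustration only, here is a minimal sketch of the client-side setup being described, assuming the reported fetcher.max.wait corresponds to the 0.8 high-level consumer property fetch.wait.max.ms, and that socket.timeout.ms controls the 30s client timeout; the connection values are placeholders:

      import java.util.Properties;

      // Illustrative only: approximates the reported misconfiguration,
      // assuming the 0.8 high-level consumer property names
      // fetch.wait.max.ms and socket.timeout.ms.
      public class LongPollConsumerProps {
          public static Properties build() {
              Properties props = new Properties();
              props.put("zookeeper.connect", "localhost:2181"); // placeholder
              props.put("group.id", "example-group");           // placeholder
              // Ask the broker to hold each fetch open essentially forever.
              props.put("fetch.wait.max.ms", String.valueOf(Integer.MAX_VALUE));
              // The client abandons the request after 30s; until then the
              // delayed fetch sits in the broker's fetch request purgatory.
              props.put("socket.timeout.ms", "30000");
              return props;
          }
      }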

      With several thousand consumer threads, the fetch request purgatory got into the 100,000-400,000 range, which we believe triggered the out-of-memory exceptions. nehanarkhede claims to have seen similar behavior in other high volume clusters.

      It seems like a bad thing that a poorly configured consumer can trigger out-of-memory exceptions in the broker, so it may make sense to have the broker try to protect itself from this situation. Here are some potential solutions (a rough sketch of both follows the list):

      1. Have a broker-side max wait config for fetch requests.
      2. Threshold the purgatory size, and either drop the oldest connections in purgatory, or reject the newest fetch requests when purgatory is full.
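
      A rough, hypothetical sketch of how the two ideas could fit together; BoundedFetchPurgatory, brokerMaxWaitMs, and maxDelayedRequests are made-up names, and this is not Kafka's actual RequestPurgatory implementation:

      import java.util.concurrent.ConcurrentLinkedQueue;
      import java.util.concurrent.atomic.AtomicInteger;

      // Hypothetical sketch of option 1 (broker-side max wait) and option 2
      // (capped purgatory size, rejecting the newest fetch when full).
      public class BoundedFetchPurgatory<T> {
          private final long brokerMaxWaitMs;      // option 1: broker-side ceiling
          private final int maxDelayedRequests;    // option 2: purgatory size cap
          private final ConcurrentLinkedQueue<T> delayed = new ConcurrentLinkedQueue<>();
          private final AtomicInteger size = new AtomicInteger();

          public BoundedFetchPurgatory(long brokerMaxWaitMs, int maxDelayedRequests) {
              this.brokerMaxWaitMs = brokerMaxWaitMs;
              this.maxDelayedRequests = maxDelayedRequests;
          }

          // Clamp the wait requested by the client to the broker-side ceiling.
          public long effectiveWaitMs(long clientMaxWaitMs) {
              return Math.min(clientMaxWaitMs, brokerMaxWaitMs);
          }

          // Try to park a delayed fetch. Returns false when purgatory is full,
          // in which case the caller should answer the fetch immediately
          // instead of holding on to the request.
          public boolean tryDelay(T request) {
              if (size.incrementAndGet() > maxDelayedRequests) {
                  size.decrementAndGet();
                  return false;
              }
              delayed.add(request);
              return true;
          }

          // Remove a request once it has been satisfied or has expired.
          public void complete(T request) {
              if (delayed.remove(request)) {
                  size.decrementAndGet();
              }
          }
      }

      One simple policy would be to reject the newest request rather than evict the oldest, since an immediately answered fetch just looks like an empty response to a well-behaved consumer.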

    Attachments

    Issue Links

    Activity

    People

        Assignee: Joel Jacob Koshy (jjkoshy)
        Reporter: Chris Riccomini (criccomini)
        Votes: 0
        Watchers: 4

    Dates

        Created:
        Updated:
        Resolved: