Kafka
  1. Kafka
  2. KAFKA-332

Mirroring should use multiple producers; add producer retries to DefaultEventHandler

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.1
    • Component/s: core
    • Labels:
      None

      Description

      I'm clubbing these two together as these are both important for mirroring.

      (1) Multiple producers:

      Shallow iteration (KAFKA-315) helps improve mirroring throughput when
      messages are compressed. With shallow iteration, the mirror-maker's consumer
      does not do deep iteration over compressed messages. However, when its
      embedded producer sends these messages to the target cluster's brokers, the
      receiving broker does deep iteration to validate the messages before
      appending to the log.

      In the current (pre- KAFKA-48) request handling mechanism, one producer
      effectively translates to one server-side thread for handling produce
      requests, so there is still a bottleneck due to decompression (due to
      message validation) on the target broker.

      One way to work around this is to use broker.list with multiple brokers
      specified per broker. E.g.,
      broker.list=0:localhost:9191,1:localhost:9191,2:localhost:9191,... which
      effectively emulates multiple server-side threads. It would be better to
      just add a num.producers option to the mirror-maker and instantiate that
      many producers.

      (2) Retries:

      If the mirror-maker uses broker.list and one of the brokers is bounced for
      any reason, messages can get lost. Message loss can be reduced/avoided if
      the brokers are behind a VIP and if retries are supported. This option will
      not work for the zk-based producer because the decision of which broker to
      send to has already been made, so retries would go to the same (potentially
      still down) broker. (With KAFKA-253 it would work for zk-based producers as
      well, but that is only in 0.8).

      1. KAFKA-332-v3.patch
        12 kB
        Joel Koshy
      2. KAFKA-332-v2.patch
        11 kB
        Joel Koshy
      3. KAFKA-332-v1.patch
        11 kB
        Joel Koshy

        Activity

        Hide
        Jun Rao added a comment -

        Thanks for the patch. Committed to trunk.

        Show
        Jun Rao added a comment - Thanks for the patch. Committed to trunk.
        Hide
        Joel Koshy added a comment -

        tryToHandle catches all now.

        Show
        Joel Koshy added a comment - tryToHandle catches all now.
        Hide
        Jun Rao added a comment -

        Patch v2 looks good. One more comment:
        4. ProducerSendThread.tryToHandle() should catch Throwable too.

        Show
        Jun Rao added a comment - Patch v2 looks good. One more comment: 4. ProducerSendThread.tryToHandle() should catch Throwable too.
        Hide
        Joel Koshy added a comment -

        Thanks for the review.

        • Captured throwable
        • Trace logging retry attempt
        • Additional doc on num.retries

        Circular iterator: I thought it would be a convenient pattern for round-robin selection and in fact makes the code simpler/clearer. If it is hard to read, we can just go with explicit modulo-based selection based on a counter - although IMO that is less clean.

        Show
        Joel Koshy added a comment - Thanks for the review. Captured throwable Trace logging retry attempt Additional doc on num.retries Circular iterator: I thought it would be a convenient pattern for round-robin selection and in fact makes the code simpler/clearer. If it is hard to read, we can just go with explicit modulo-based selection based on a counter - although IMO that is less clean.
        Hide
        Jun Rao added a comment -

        Some comments:
        1. DefaultEventHandler:
        1.1. It would be useful to see the retry # in trace log
        1.2 We should capture all Throwable.

        2. ProducerConfig: explain a bit more why num.retries is not appropriate for zk-based producer. Basically, during resend, we don't re-select brokers.

        3. MirrorMaker: The usage of circularIterator is pretty fancy. Would it be simpler to just put all producers in an array and loop through it circularly?

        Show
        Jun Rao added a comment - Some comments: 1. DefaultEventHandler: 1.1. It would be useful to see the retry # in trace log 1.2 We should capture all Throwable. 2. ProducerConfig: explain a bit more why num.retries is not appropriate for zk-based producer. Basically, during resend, we don't re-select brokers. 3. MirrorMaker: The usage of circularIterator is pretty fancy. Would it be simpler to just put all producers in an array and loop through it circularly?

          People

          • Assignee:
            Joel Koshy
            Reporter:
            Joel Koshy
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development