Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-1355

Reduce/optimize update metadata requests sent during leader election

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 0.8.1
    • 0.8.1.1
    • None
    • None

    Description

      This is part of the investigation into slow shutdowns in 0.8.1. While
      logging contributes to bulk of the regression, this one also adds
      quite a bit of overhead:

      In addLeaderAndIsrRequest (called for every partition that is led by the
      broker being shut down) we also add an UpdateMetadataRequest - each call to
      addUpdateMetadataRequests does two traversals over all (global)
      partitions. I think it should be straightforward to optimize this a bit.

      Marking as critical, since it is not as big an overhead as the logging.

      Attachments

        1. KAFKA-1355_2014-04-04_13:48:34.patch
          16 kB
          Joel Jacob Koshy
        2. KAFKA-1355_2014-04-04_13:51:22.patch
          16 kB
          Joel Jacob Koshy
        3. KAFKA-1355_2014-04-17_14:48:57.patch
          16 kB
          Joel Jacob Koshy
        4. KAFKA-1355.patch
          16 kB
          Joel Jacob Koshy

        Issue Links

          Activity

            BTW, here are the configs and steps I used for this, KAFKA-1342 and KAFKA-1350:

            Four brokers, 100 topics, eight partitions each.

            Log4j and server configs by broker:
            https://gist.github.com/anonymous/9906088
            https://gist.github.com/anonymous/9906092
            https://gist.github.com/anonymous/9906096
            https://gist.github.com/anonymous/9906102
            https://gist.github.com/anonymous/9906144
            https://gist.github.com/anonymous/9906148
            https://gist.github.com/anonymous/9906153
            https://gist.github.com/anonymous/9906157

            Producer performance: https://gist.github.com/anonymous/9906163

            (At the end, just grep in the controller's request log and extract local time)
            grep ControlledShutdownRequest logs/kafka-request*

            jjkoshy Joel Jacob Koshy added a comment - BTW, here are the configs and steps I used for this, KAFKA-1342 and KAFKA-1350 : Four brokers, 100 topics, eight partitions each. Log4j and server configs by broker: https://gist.github.com/anonymous/9906088 https://gist.github.com/anonymous/9906092 https://gist.github.com/anonymous/9906096 https://gist.github.com/anonymous/9906102 https://gist.github.com/anonymous/9906144 https://gist.github.com/anonymous/9906148 https://gist.github.com/anonymous/9906153 https://gist.github.com/anonymous/9906157 Producer performance: https://gist.github.com/anonymous/9906163 (At the end, just grep in the controller's request log and extract local time) grep ControlledShutdownRequest logs/kafka-request*
            jjkoshy Joel Jacob Koshy added a comment - https://reviews.apache.org/r/20038

            Updated reviewboard https://reviews.apache.org/r/20038/
            against branch origin/trunk

            jjkoshy Joel Jacob Koshy added a comment - Updated reviewboard https://reviews.apache.org/r/20038/ against branch origin/trunk

            Updated reviewboard https://reviews.apache.org/r/20038/
            against branch origin/trunk

            jjkoshy Joel Jacob Koshy added a comment - Updated reviewboard https://reviews.apache.org/r/20038/ against branch origin/trunk

            Committed to trunk (including the comment fix in Jun's follow-up review).

            Need patch for 0.8.1

            jjkoshy Joel Jacob Koshy added a comment - Committed to trunk (including the comment fix in Jun's follow-up review). Need patch for 0.8.1

            Created reviewboard https://reviews.apache.org/r/20232/
            against branch origin/0.8.1

            jjkoshy Joel Jacob Koshy added a comment - Created reviewboard https://reviews.apache.org/r/20232/ against branch origin/0.8.1
            nehanarkhede Neha Narkhede added a comment -

            jjkoshy Should we also check in this patch to 0.8.1. I'm not sure if we waiting on something?

            nehanarkhede Neha Narkhede added a comment - jjkoshy Should we also check in this patch to 0.8.1. I'm not sure if we waiting on something?

            Tim's patch in 1363 conflicts with this. So we can get 1363 in first, and I will rebase this one.

            jjkoshy Joel Jacob Koshy added a comment - Tim's patch in 1363 conflicts with this. So we can get 1363 in first, and I will rebase this one.
            nehanarkhede Neha Narkhede added a comment -

            KAFKA-1363 is in 0.8.1 as well as trunk now.

            nehanarkhede Neha Narkhede added a comment - KAFKA-1363 is in 0.8.1 as well as trunk now.

            Sorry - I meant KAFKA-1356, not 1363 - will check that in first after reviewing and rebase this.

            jjkoshy Joel Jacob Koshy added a comment - Sorry - I meant KAFKA-1356 , not 1363 - will check that in first after reviewing and rebase this.

            That has been marked as closed, but https://reviews.apache.org/r/20252/ has not been checked into 0.8.1

            jjkoshy Joel Jacob Koshy added a comment - That has been marked as closed, but https://reviews.apache.org/r/20252/ has not been checked into 0.8.1

            Updated reviewboard https://reviews.apache.org/r/20232/
            against branch origin/0.8.1

            jjkoshy Joel Jacob Koshy added a comment - Updated reviewboard https://reviews.apache.org/r/20232/ against branch origin/0.8.1

            Thanks for the review. Committed to 0.8.1 as well.

            jjkoshy Joel Jacob Koshy added a comment - Thanks for the review. Committed to 0.8.1 as well.

            People

              Unassigned Unassigned
              jjkoshy Joel Jacob Koshy
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: