Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 0.8.1
    • 0.8.1.1
    • None
    • None

    Description

      This is part of the investigation into slow shutdowns in 0.8.1. While
      logging contributes to bulk of the regression, this one also adds
      quite a bit of overhead:

      In addLeaderAndIsrRequest (called for every partition that is led by the
      broker being shut down) we also add an UpdateMetadataRequest - each call to
      addUpdateMetadataRequests does two traversals over all (global)
      partitions. I think it should be straightforward to optimize this a bit.

      Marking as critical, since it is not as big an overhead as the logging.

      Attachments

        1. KAFKA-1355_2014-04-04_13:48:34.patch
          16 kB
          Joel Jacob Koshy
        2. KAFKA-1355_2014-04-04_13:51:22.patch
          16 kB
          Joel Jacob Koshy
        3. KAFKA-1355_2014-04-17_14:48:57.patch
          16 kB
          Joel Jacob Koshy
        4. KAFKA-1355.patch
          16 kB
          Joel Jacob Koshy

        Issue Links

        Activity

          BTW, here are the configs and steps I used for this, KAFKA-1342 and KAFKA-1350:

          Four brokers, 100 topics, eight partitions each.

          Log4j and server configs by broker:
          https://gist.github.com/anonymous/9906088
          https://gist.github.com/anonymous/9906092
          https://gist.github.com/anonymous/9906096
          https://gist.github.com/anonymous/9906102
          https://gist.github.com/anonymous/9906144
          https://gist.github.com/anonymous/9906148
          https://gist.github.com/anonymous/9906153
          https://gist.github.com/anonymous/9906157

          Producer performance: https://gist.github.com/anonymous/9906163

          (At the end, just grep in the controller's request log and extract local time)
          grep ControlledShutdownRequest logs/kafka-request*

          jjkoshy Joel Jacob Koshy added a comment - BTW, here are the configs and steps I used for this, KAFKA-1342 and KAFKA-1350 : Four brokers, 100 topics, eight partitions each. Log4j and server configs by broker: https://gist.github.com/anonymous/9906088 https://gist.github.com/anonymous/9906092 https://gist.github.com/anonymous/9906096 https://gist.github.com/anonymous/9906102 https://gist.github.com/anonymous/9906144 https://gist.github.com/anonymous/9906148 https://gist.github.com/anonymous/9906153 https://gist.github.com/anonymous/9906157 Producer performance: https://gist.github.com/anonymous/9906163 (At the end, just grep in the controller's request log and extract local time) grep ControlledShutdownRequest logs/kafka-request*
          jjkoshy Joel Jacob Koshy added a comment - https://reviews.apache.org/r/20038

          Updated reviewboard https://reviews.apache.org/r/20038/
          against branch origin/trunk

          jjkoshy Joel Jacob Koshy added a comment - Updated reviewboard https://reviews.apache.org/r/20038/ against branch origin/trunk

          Updated reviewboard https://reviews.apache.org/r/20038/
          against branch origin/trunk

          jjkoshy Joel Jacob Koshy added a comment - Updated reviewboard https://reviews.apache.org/r/20038/ against branch origin/trunk

          Committed to trunk (including the comment fix in Jun's follow-up review).

          Need patch for 0.8.1

          jjkoshy Joel Jacob Koshy added a comment - Committed to trunk (including the comment fix in Jun's follow-up review). Need patch for 0.8.1

          Created reviewboard https://reviews.apache.org/r/20232/
          against branch origin/0.8.1

          jjkoshy Joel Jacob Koshy added a comment - Created reviewboard https://reviews.apache.org/r/20232/ against branch origin/0.8.1
          nehanarkhede Neha Narkhede added a comment -

          Joel Jacob Koshy Should we also check in this patch to 0.8.1. I'm not sure if we waiting on something?

          nehanarkhede Neha Narkhede added a comment - Joel Jacob Koshy Should we also check in this patch to 0.8.1. I'm not sure if we waiting on something?

          Tim's patch in 1363 conflicts with this. So we can get 1363 in first, and I will rebase this one.

          jjkoshy Joel Jacob Koshy added a comment - Tim's patch in 1363 conflicts with this. So we can get 1363 in first, and I will rebase this one.
          nehanarkhede Neha Narkhede added a comment -

          KAFKA-1363 is in 0.8.1 as well as trunk now.

          nehanarkhede Neha Narkhede added a comment - KAFKA-1363 is in 0.8.1 as well as trunk now.

          Sorry - I meant KAFKA-1356, not 1363 - will check that in first after reviewing and rebase this.

          jjkoshy Joel Jacob Koshy added a comment - Sorry - I meant KAFKA-1356 , not 1363 - will check that in first after reviewing and rebase this.

          That has been marked as closed, but https://reviews.apache.org/r/20252/ has not been checked into 0.8.1

          jjkoshy Joel Jacob Koshy added a comment - That has been marked as closed, but https://reviews.apache.org/r/20252/ has not been checked into 0.8.1

          Updated reviewboard https://reviews.apache.org/r/20232/
          against branch origin/0.8.1

          jjkoshy Joel Jacob Koshy added a comment - Updated reviewboard https://reviews.apache.org/r/20232/ against branch origin/0.8.1

          Thanks for the review. Committed to 0.8.1 as well.

          jjkoshy Joel Jacob Koshy added a comment - Thanks for the review. Committed to 0.8.1 as well.

          People

            Unassigned Unassigned
            jjkoshy Joel Jacob Koshy
            Votes:
            1 Vote for this issue
            Watchers:
            Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack