Kafka / KAFKA-9211

Kafka upgrade to 2.3.0 may cause TCP delayed ACK (congestion control)


    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: controller, producer
    • Labels:
      None

      Description

      We recently tried to upgrade Kafka from 0.10.0.1 to 2.3.0.

      We have 15 clusters in the production environment, each with 3 to 6 brokers.

      As we understand it, the upgrade procedure is:
            1. Replace the code with 2.3.0.jar and restart all brokers one by one.
            2. Unset inter.broker.protocol.version=0.10.0.1 and restart all brokers one by one.
            3. Unset log.message.format.version=0.10.0.1 and restart all brokers one by one.
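
      The three steps above can be summarized by which version pins remain in each broker's server.properties after a step has been rolled out. The following is only a sketch of that reading: the class and method names are hypothetical, and it assumes that "unset" means removing the line so the broker falls back to the default of the new release.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpgradePhases {
    // Version the clusters are being upgraded from, as stated in the report.
    static final String OLD_VERSION = "0.10.0.1";

    /**
     * Returns the version pins expected to remain in server.properties
     * after the given upgrade step (1, 2 or 3) has been applied to a broker.
     */
    static Map<String, String> pinsAfterStep(int step) {
        if (step < 1 || step > 3) {
            throw new IllegalArgumentException("step must be 1, 2 or 3");
        }
        Map<String, String> pins = new LinkedHashMap<>();
        if (step == 1) {
            // Step 1: new jar, but brokers still speak the old protocol.
            pins.put("inter.broker.protocol.version", OLD_VERSION);
        }
        if (step <= 2) {
            // Steps 1 and 2: messages are still written in the old format.
            pins.put("log.message.format.version", OLD_VERSION);
        }
        // Step 3: no pins left, the broker uses the new release's defaults.
        return pins;
    }

    public static void main(String[] args) {
        for (int step = 1; step <= 3; step++) {
            System.out.println("after step " + step + ": " + pinsAfterStep(step));
        }
    }
}
```

      In these terms, the slowdown described in this report appears once most brokers move from the step 1 state to the step 2 state.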
       
      So far we have completed steps 1 and 2 on 12 clusters. But when we tried step 2 on the remaining clusters (which had already completed step 1), the produce rate of some topics dropped badly.
      We have investigated this issue for a long time. Since we cannot experiment in the production environment and cannot reproduce it in the test environment, we have not found the root cause.
      For now we can only describe the situation in as much detail as we know, and hope someone can help us.
       
      1. Because of bug KAFKA-8653, I added the code below to the handleJoinGroupRequest function in KafkaApis.scala:

      if (rebalanceTimeoutMs <= 0) {
       rebalanceTimeoutMs = joinGroupRequest.data.sessionTimeoutMs
      }

      2. One cluster where the upgrade failed has 6 brokers (8 cores, 16 GB each) and about 200 topics with 2 replicas. Every broker holds 3000+ partitions and 1500+ leader partitions, but most topics have a very low produce rate, below 50 messages/sec. Only one topic, with 300 partitions, receives more than 2500 messages/sec, and more than 20 consumer groups consume from it.

      The whole cluster therefore produces about 4K messages/sec, with 11 MB/sec in and 240 MB/sec out, and more than 90% of the traffic comes from that 2500 messages/sec topic.

      When we unset inter.broker.protocol.version=0.10.0.1 on 5 or 6 of the servers and restart them, the produce rate of this topic drops to about 200 messages/sec. I don't know whether the way we use Kafka could trigger a problem.

      3. We use Kafka through spring-kafka and set KafkaTemplate's autoFlush=true, so every producer.send call is immediately followed by producer.flush. I know that flush decreases produce performance dramatically, but at least nothing seemed wrong before upgrade step 2. I wonder whether it has become a problem after the upgrade.
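
      To illustrate why send-then-flush is sensitive to any extra per-request latency: flush blocks until the outstanding records complete, so each sending thread can finish at most one send per round trip to the leader. The sketch below is only back-of-the-envelope arithmetic, not a measurement. The class and method names are hypothetical, and the 40 ms figure is an assumed, commonly cited Linux delayed-ACK timer value that was not measured in this report.

```java
public class FlushThroughput {
    /**
     * Rough upper bound on sends per second when every send waits for one
     * full round trip (send followed by flush) before the next one starts.
     * Ignores batching between threads that share a producer.
     */
    static double maxSendsPerSecond(int concurrentSenders, double roundTripMs) {
        if (concurrentSenders <= 0 || roundTripMs <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        return concurrentSenders * 1000.0 / roundTripMs;
    }

    public static void main(String[] args) {
        double normalRoundTripMs = 1.0;   // assumed healthy round trip
        double delayedAckMs = 40.0;       // assumed delayed-ACK timer, not measured

        System.out.printf("1 sender, %.0f ms round trip: %.0f sends/sec%n",
                normalRoundTripMs, maxSendsPerSecond(1, normalRoundTripMs));
        System.out.printf("1 sender, %.0f ms round trip: %.0f sends/sec%n",
                delayedAckMs, maxSendsPerSecond(1, delayedAckMs));
    }
}
```

      Under these assumptions, a per-request stall of tens of milliseconds would cap each sending thread at a few dozen sends per second while CPU and disk stay mostly idle.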

      4. I noticed that when the produce rate decreases, consumer groups with a large message lag keep consuming with no change or decrease in consume rate, so I guess only ProduceRequest throughput drops and FetchRequest throughput does not.

      5. We haven't set any throttle configuration, and all producers use acks=1 (so slow replica fetching on the brokers is not the cause). When the problem is triggered, CPU usage drops on both servers and producers, and the servers' ioutil stays below 30%, so it shouldn't be a hardware problem.

      6. This event is triggered very often (almost 100% of the time): once most brokers have completed upgrade step 2 and an automatic leader replica election has run, we observe the produce rate drop. We then have to downgrade the brokers (set inter.broker.protocol.version=0.10.0.1) and restart them one by one before things return to normal. In some clusters we have to downgrade all brokers, but in others 1 or 2 brokers can be left without downgrading. I noticed that the broker that does not need a downgrade is the controller.

      7. I have captured jstack output for the producer and the servers. Although they were not taken from the same cluster, the threads appear to be really idle.

      8. Both the 0.10.0.1 and the 2.3.0 kafka-client trigger this problem.

      9. Only the largest topic drops its produce rate every time; other topics drop randomly. For example, topicA may drop in the first upgrade attempt but not in the next, while topicB does not drop in the first attempt but does in another.

      10. In fact, the largest cluster has the same topic and group usage scenario described above, but its largest topic has about 12,000 messages/sec, and its upgrade already fails at step 1 (just using 2.3.0.jar).

      Any help would be appreciated, thanks. I'm very sad now...

        Attachments

        1. producer-jstack.txt
          864 kB
          li xiangyuan
        2. broker-jstack.txt
          98 kB
          li xiangyuan
        3. producer.node.latency.png
          93 kB
          li xiangyuan
        4. ackdelay.txt
          329 kB
          li xiangyuan
        5. nodelay.txt
          448 kB
          li xiangyuan

            People

            • Assignee:
              Unassigned
              Reporter:
              flashmouse li xiangyuan