KAFKA-16555

Consumer's RequestState has incorrect logic to determine if inflight

Details

    Description

      When running system tests for the new consumer, I hit an issue where the HeartbeatRequestManager sends multiple concurrent CONSUMER_GROUP_REQUEST RPCs. As a result, the coordinator creates multiple members, which causes downstream assignment problems.

      Here's the order of events:

      • Time 202: HeartbeatRequestManager.poll() determines it's OK to send a request. In doing so, it updates the RequestState's lastSentMs to the current timestamp, 202
      • Time 236: the response is received and the response handler is invoked, setting the RequestState's lastReceivedMs to the current timestamp, 236
      • Time 236: HeartbeatRequestManager.poll() is invoked again, and it sees that it's OK to send a request. It creates another request, once again updating the RequestState's lastSentMs to the current timestamp, 236
      • Time 237: HeartbeatRequestManager.poll() is invoked again, and ERRONEOUSLY decides it's OK to send another request, despite one already being in flight.

      Here's the problem with requestInFlight():

      public boolean requestInFlight() {
          return this.lastSentMs > -1 && this.lastReceivedMs < this.lastSentMs;
      }
      

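      In our case, lastReceivedMs is 236 and lastSentMs is also 236. The received timestamp is therefore equal to the sent timestamp, not less than it, so requestInFlight() returns false even though the request sent at time 236 has not yet received a response.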
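
To make the failure mode concrete, here is a minimal sketch that replays the timeline above against two simplified stand-ins for RequestState. All class and method names here (RequestStateSketch, TimestampState, FlagState, onSendAttempt, onResponse) are hypothetical and are not the actual Kafka classes. The flag-based variant is only one possible way to avoid the equal-timestamp ambiguity, not necessarily the fix that was applied.

```java
// Hypothetical, simplified stand-ins for RequestState; not the actual Kafka code.
class RequestStateSketch {

    /** Mirrors the timestamp comparison quoted in the description. */
    static class TimestampState {
        long lastSentMs = -1;
        long lastReceivedMs = -1;

        void onSendAttempt(long nowMs) { lastSentMs = nowMs; }
        void onResponse(long nowMs) { lastReceivedMs = nowMs; }

        boolean requestInFlight() {
            // Ambiguous when a send and a receive share the same millisecond.
            return lastSentMs > -1 && lastReceivedMs < lastSentMs;
        }
    }

    /** Assumed alternative: track the in-flight status with an explicit flag. */
    static class FlagState {
        long lastSentMs = -1;
        long lastReceivedMs = -1;
        boolean inFlight = false;

        void onSendAttempt(long nowMs) { lastSentMs = nowMs; inFlight = true; }
        void onResponse(long nowMs) { lastReceivedMs = nowMs; inFlight = false; }

        boolean requestInFlight() { return inFlight; }
    }

    /** Replays the timeline and reports what the timestamp check says at time 237. */
    static boolean timestampInFlightAt237() {
        TimestampState state = new TimestampState();
        state.onSendAttempt(202); // first request sent
        state.onResponse(236);    // response to the first request
        state.onSendAttempt(236); // second request sent in the same millisecond
        return state.requestInFlight();
    }

    /** Same timeline, using the explicit flag instead of comparing timestamps. */
    static boolean flagInFlightAt237() {
        FlagState state = new FlagState();
        state.onSendAttempt(202);
        state.onResponse(236);
        state.onSendAttempt(236);
        return state.requestInFlight();
    }

    public static void main(String[] args) {
        // The timestamp check misses the request sent at 236; the flag does not.
        System.out.println("timestamp-based in flight: " + timestampInFlightAt237());
        System.out.println("flag-based in flight: " + flagInFlightAt237());
    }
}
```

With the timestamp comparison, the second request is not reported as in flight because 236 < 236 is false. The explicit flag does not depend on clock resolution, so it still reports the outstanding request.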

            People

              kirktrue Kirk True
              kirktrue Kirk True
              Lucas Brutschy Lucas Brutschy