Kafka / KAFKA-9171

DelayedFetch completion may throw exception, causing successful produce to be failed

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: core
    • Labels: None

      Description

      I was looking at the logs of the system test failure of ReassignPartitionsTest.

      The producer log shows a ReplicaNotAvailableException produce error for two records, yet the data logs of all the brokers contain those records. The offsets of these two records were then returned in successful produce responses for two subsequent records, which do not appear in the logs, and hence the test failed.

      Broker logs of the leader at the time of the reassignment and leader change show:

       

      {{[2019-11-11 07:23:17,727] ERROR [ReplicaManager broker=3] Error processing append operation on partition test_topic-17 (kafka.server.ReplicaManager)
      org.apache.kafka.common.errors.ReplicaNotAvailableException: Partition test_topic-5 is not available}}

      This fails the append operation on `test_topic-17` because a different partition, `test_topic-5`, was unavailable for fetch. I think the exception came from the fetch path, since a produce would have thrown NotLeaderForPartitionException rather than ReplicaNotAvailableException.

      We don't expect DelayedFetch to throw exceptions, and it looks like we are not handling `ReplicaNotAvailableException` there.
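      To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not Kafka's actual DelayedFetch code: the class `DelayedFetchSketch`, the stand-in exception `ReplicaUnavailable`, and the methods `logEndOffset`, `tryCompleteUnsafe` and `tryCompleteSafe` are all hypothetical names chosen for illustration. The assumption is that a completion check for a pending fetch can be triggered by an append to an unrelated partition, so any exception escaping the check surfaces in the produce path.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Kafka code base.
final class DelayedFetchSketch {

    /** Stand-in for org.apache.kafka.common.errors.ReplicaNotAvailableException. */
    static final class ReplicaUnavailable extends RuntimeException {
        ReplicaUnavailable(String partition) {
            super("Partition " + partition + " is not available");
        }
    }

    /** Simulated lookup: throws if this broker no longer hosts the replica. */
    static long logEndOffset(String partition, Set<String> unavailable) {
        if (unavailable.contains(partition)) {
            throw new ReplicaUnavailable(partition);
        }
        return 100L; // fixed log end offset, purely for the simulation
    }

    /**
     * Unsafe completion check: an unavailable fetch partition makes the
     * exception escape to the caller, which may be a produce request
     * appending to a completely different partition.
     */
    static boolean tryCompleteUnsafe(List<String> fetchPartitions,
                                     Set<String> unavailable,
                                     long fetchOffset) {
        for (String partition : fetchPartitions) {
            if (logEndOffset(partition, unavailable) > fetchOffset) {
                return true;
            }
        }
        return false;
    }

    /**
     * Safer completion check: the exception is caught per partition and the
     * fetch is completed immediately, so the error can be reported to the
     * fetcher instead of propagating into the produce path.
     */
    static boolean tryCompleteSafe(List<String> fetchPartitions,
                                   Set<String> unavailable,
                                   long fetchOffset) {
        for (String partition : fetchPartitions) {
            try {
                if (logEndOffset(partition, unavailable) > fetchOffset) {
                    return true;
                }
            } catch (ReplicaUnavailable e) {
                return true; // complete now; the fetch response carries the error
            }
        }
        return false;
    }
}
```

      In this sketch, calling `tryCompleteUnsafe` for a fetch on `test_topic-5` while that partition is unavailable throws, whereas `tryCompleteSafe` returns `true` so the fetch completes and the caller is unaffected. Whether the real fix should complete the fetch or handle the error some other way is a design decision this example does not settle.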

      I am not sure whether fixing this resolves the issues with ReassignPartitionsTest, but this seems to be a scenario that we should fix.

              People

              • Assignee: Rajini Sivaram (rsivaram)
              • Reporter: Rajini Sivaram (rsivaram)
              • Reviewer: Ismael Juma
