Kafka / KAFKA-2318

replica manager repeatedly tries to fetch from partitions already moved during controlled shutdown


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Auto Closed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: replication
    • Labels: None

    Description

      Using version 0.8.2.1.
      During a controlled shutdown, it seems like the left hand is often not talking to the right.
      In this case, we see the ReplicaManager remove a fetcher for a partition, truncate its log, and then apparently try to fetch data from that partition repeatedly, spamming the log with "failed due to Leader not local for partition" warnings.

      Below is a snippet (in this case it happened for partitions '__consumer_offsets,7' and '__consumer_offsets,47'). It went on for quite a bit longer than included here. The current broker is '99' here.

      2015-07-07 18:54:26,415  INFO [kafka-request-handler-0] server.ReplicaFetcherManager - [ReplicaFetcherManager on broker 99] Removed fetcher for partitions [__consumer_offsets,7]
      2015-07-07 18:54:26,415  INFO [kafka-request-handler-0] log.Log - Truncating log __consumer_offsets-7 to offset 0.
      2015-07-07 18:54:26,421  WARN [kafka-request-handler-3] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 6832556 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,429  WARN [kafka-request-handler-4] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345717 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,430  WARN [kafka-request-handler-2] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345718 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,431  WARN [kafka-request-handler-4] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345719 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,432  WARN [kafka-request-handler-5] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345720 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,433  WARN [kafka-request-handler-2] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345721 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,434  WARN [kafka-request-handler-3] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345722 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,436  WARN [kafka-request-handler-1] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345723 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,437  WARN [kafka-request-handler-2] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345724 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,438  WARN [kafka-request-handler-7] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345725 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,438  INFO [kafka-request-handler-6] server.ReplicaFetcherManager - [ReplicaFetcherManager on broker 99] Removed fetcher for partitions [__consumer_offsets,47]
      2015-07-07 18:54:26,438  INFO [kafka-request-handler-6] log.Log - Truncating log __consumer_offsets-47 to offset 0.
      2015-07-07 18:54:26,439  WARN [kafka-request-handler-1] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345726 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,443  WARN [kafka-request-handler-3] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345727 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,446  WARN [kafka-request-handler-5] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 6832559 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,47] failed due to Leader not local for partition [__consumer_offsets,47] on broker 99
      2015-07-07 18:54:26,446  WARN [kafka-request-handler-0] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345728 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
      2015-07-07 18:54:26,447  WARN [kafka-request-handler-1] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 6832560 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,47] failed due to Leader not local for partition [__consumer_offsets,47] on broker 99
      2015-07-07 18:54:26,447  WARN [kafka-request-handler-2] server.ReplicaManager - [Replica Manager on Broker 99]: Fetch request with correlation id 4345729 from client ReplicaFetcherThread-0-99 on partition [__consumer_offsets,7] failed due to Leader not local for partition [__consumer_offsets,7] on broker 99
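
      To gauge how heavy the spam is per partition, the WARN lines above can be tallied with a small script. This is only a sketch: the regular expression is an assumption derived from the sample lines in this report (not from Kafka's logging code), and the names WARN_RE and count_leader_not_local are made up for illustration.

```python
import re
import sys
from collections import Counter

# Pattern inferred from the WARN lines quoted in this report (an assumption,
# not taken from Kafka's source). Captures correlation id, client id, topic
# and partition number.
WARN_RE = re.compile(
    r"Fetch request with correlation id (\d+) from client (\S+) "
    r"on partition \[([^,\]]+),(\d+)\] failed due to Leader not local"
)


def count_leader_not_local(lines):
    """Count 'Leader not local' fetch failures per (topic, partition)."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            topic, partition = match.group(3), int(match.group(4))
            counts[(topic, partition)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: python count_warnings.py < server.log
    for (topic, partition), n in count_leader_not_local(sys.stdin).most_common():
        print(f"{topic}-{partition}: {n}")
```

      Applied to the excerpt above, this counts 14 warnings for [__consumer_offsets,7] and 2 for [__consumer_offsets,47], all within roughly 30 ms.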
      


          People

            Assignee: Unassigned
            Reporter: Jason Rosenberg (jbrosenberg@gmail.com)
            Votes: 0
            Watchers: 5
