Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-5709

Improve logging to include errors from state-change log in server log

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.10.2.0
    • None
    • log
    • None

    Description

      Problem
      The following message was generated over and over again when running kafka-console-producer or kafka-console-consumer with SSL and ACLs enabled

      WARN Error while fetching metadata with correlation id 1 : {test=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)` endlessly when I run kafka-console-producer or kafka-console-consumer
      

      however server log (or authorizer log) did not indicate any problem.

      Background
      1) Initially, security.inter.broker.protocol setting was missing in server.properties, so connection was falling back to using plaintext port (9092), and only state-change.log actually contained the underlying error:

      [2017-08-04 13:40:48,536] TRACE Controller 0 epoch 6 received response {error_code=31,partitions=[{topic=test,partition=0,error_code=31},{topic=__confluent.support.metrics,partition=0,error_code=31}]} for a request sent to broker localhost:9092 (id: 0 rack: null) (state.change.logger)
      

      as per
      https://kafka.apache.org/protocol#protocol_error_codes

      CLUSTER_AUTHORIZATION_FAILED	31	False	Cluster authorization failed.
      

      2) After setting "security.inter.broker.protocol=SSL" the port changed to secure (9093) yet the error in state-change log did not go away:

      [2017-08-04 13:49:38,462] TRACE Controller 0 epoch 7 received response {error_code=31} for a request sent to broker localhost:9093 (id: 0 rack: null) (state.change.logger)
      

      and LEADER_NOT_AVAILABLE was still generated.

      This time though, kafka-authorizer.log had a better indication of the problem:

      [2017-08-04 18:17:46,770] DEBUG No acl found for resource Cluster:kafka-cluster, authorized = false (kafka.authorizer.logger)
      [2017-08-04 18:17:46,770] DEBUG Principal = User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown is Denied Operation = ClusterAction from host = 127.0.0.1 on resource = Cluster:kafka-cluster (kafka.authorizer.logger)
      

      The issue being that topic metadata is not propagated successfully from controller to broker since the broker user doesn't have ClusterAction permission.
      Fixed by

      bin/kafka-acls --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal "User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown" --operation ClusterAction --cluster
      

      Request
      The debugging is tricky since the controller to broker logging is done in controller/state-change log, not in the main server log.
      We need to improve the logging on this.

      Attachments

        Activity

          People

            Unassigned Unassigned
            alla@confluent.io Alla Tumarkin
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: