Kafka
  1. Kafka
  2. KAFKA-697

ConsoleConsumer throws InvalidConfigException for "." in client id

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      kafka.common.InvalidConfigException: client.id console-consumer-16946-ConsumerFetcherThread-console-consumer-16946_XXXX-host18.corp-1358006528116-991e3f2d-0-1 is illegal, contains a character other than ASCII alphanumerics, _ and -
      at kafka.common.Config$class.validateChars(Config.scala:32)
      at kafka.consumer.ConsumerConfig$.validateChars(ConsumerConfig.scala:25)
      at kafka.consumer.ConsumerConfig$.validateClientId(ConsumerConfig.scala:55)
      at kafka.consumer.SimpleConsumer.<init>(SimpleConsumer.scala:88)
      at kafka.server.AbstractFetcherThread.<init>(AbstractFetcherThread.scala:44)
      at kafka.consumer.ConsumerFetcherThread.<init>(ConsumerFetcherThread.scala:27)
      at kafka.consumer.ConsumerFetcherManager.createFetcherThread(ConsumerFetcherManager.scala:93)
      at kafka.server.AbstractFetcherManager.addFetcher(AbstractFetcherManager.scala:44)
      at kafka.consumer.ConsumerFetcherManager$$anon$1$$anonfun$doWork$3.apply(ConsumerFetcherManager.scala:75)
      at kafka.consumer.ConsumerFetcherManager$$anon$1$$anonfun$doWork$3.apply(ConsumerFetcherManager.scala:72)
      at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
      at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
      at scala.collection.Iterator$class.foreach(Iterator.scala:631)
      at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:161)
      at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:194)
      at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
      at scala.collection.mutable.HashMap.foreach(HashMap.scala:80)
      at kafka.consumer.ConsumerFetcherManager$$anon$1.doWork(ConsumerFetcherManager.scala:72)
      at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)

      1. kafka-697-allow-dot.patch
        6 kB
        Swapnil Ghike
      2. kafka-697.patch
        1.0 kB
        Jun Rao

        Issue Links

          Activity

          John Fung created issue -
          Hide
          Swapnil Ghike added a comment -

          Currently we allow only alphanumerics, - and _ in the clientId, topic name and groupId. The topic name validation criteria were decided in KAFKA-495 and later while deciding to validate clientId and groupId in KAFKA-646, we tried to maintain consistency with the topic name validation.

          As discussed in KAFKA-495, we don't have any strong reason to disallow dot in the validation, but we did not see any strong reason to make special case exceptions at the time as well and decided to make the validation criteria strict.

          For now, can we change the system test to not use dot? We can add it to the set of legal characters later though if more use cases arise.

          Show
          Swapnil Ghike added a comment - Currently we allow only alphanumerics, - and _ in the clientId, topic name and groupId. The topic name validation criteria were decided in KAFKA-495 and later while deciding to validate clientId and groupId in KAFKA-646 , we tried to maintain consistency with the topic name validation. As discussed in KAFKA-495 , we don't have any strong reason to disallow dot in the validation, but we did not see any strong reason to make special case exceptions at the time as well and decided to make the validation criteria strict. For now, can we change the system test to not use dot? We can add it to the set of legal characters later though if more use cases arise.
          Hide
          Jun Rao added a comment -

          In kafka-683 we changed client.id for a consumer to be config.clientId + "-" + consumerIdString + fetcherId + sourceBroker.id. ConsumerIdString has the format of consumerHostName + currentTimeMillis + randomUUID. The problem is that consumerHostName may have ".", which breaks the naming convention. However, such a client.id is too verbose. For debugging, we need to know the client IP address. We can fix that for every type of request in a separate jira. With client IP address and correlation id, we should have enough information for debugging. So, for now, just attach a patch to revert client.id to just config.ClientId.

          Show
          Jun Rao added a comment - In kafka-683 we changed client.id for a consumer to be config.clientId + "-" + consumerIdString + fetcherId + sourceBroker.id. ConsumerIdString has the format of consumerHostName + currentTimeMillis + randomUUID. The problem is that consumerHostName may have ".", which breaks the naming convention. However, such a client.id is too verbose. For debugging, we need to know the client IP address. We can fix that for every type of request in a separate jira. With client IP address and correlation id, we should have enough information for debugging. So, for now, just attach a patch to revert client.id to just config.ClientId.
          Jun Rao made changes -
          Field Original Value New Value
          Attachment kafka-697.patch [ 12564635 ]
          Hide
          John Fung added a comment -

          Tested kafka-697.patch and it fixes the exception. Ran a full System Test and looks good.

          Show
          John Fung added a comment - Tested kafka-697.patch and it fixes the exception. Ran a full System Test and looks good.
          Hide
          Neha Narkhede added a comment - - edited

          Unfortunately, this rule of not allowing a '.' in the clientid seems very restrictive. Here, it is useful for the client id to have the hostname of the fetcher thread. This allows us to look at the kafka request log, take that client id, go the box, take a thread dump and correlate the behavior of the thread from the thread dump and the log4j using the client id string. This saves time required for troubleshooting. Since we have this restriction, we will have to take the client IP from the request log, convert it to a hostname, construct the thread name from this information and then try to understand the thread behavior from the thread dump and log4j. Also, one needs to know Kafka code to figure out how to construct the thread name from all this information, which is even worse.

          I think this is making things more difficult for us, which is a down side to disallowing '.' in the client id.

          Show
          Neha Narkhede added a comment - - edited Unfortunately, this rule of not allowing a '.' in the clientid seems very restrictive. Here, it is useful for the client id to have the hostname of the fetcher thread. This allows us to look at the kafka request log, take that client id, go the box, take a thread dump and correlate the behavior of the thread from the thread dump and the log4j using the client id string. This saves time required for troubleshooting. Since we have this restriction, we will have to take the client IP from the request log, convert it to a hostname, construct the thread name from this information and then try to understand the thread behavior from the thread dump and log4j. Also, one needs to know Kafka code to figure out how to construct the thread name from all this information, which is even worse. I think this is making things more difficult for us, which is a down side to disallowing '.' in the client id.
          Hide
          Sriram Subramanian added a comment -

          +1 on Neha's comment. It is simpler to allow '.' rather than getting the ip and adding code to pass that as part of the request object.

          Show
          Sriram Subramanian added a comment - +1 on Neha's comment. It is simpler to allow '.' rather than getting the ip and adding code to pass that as part of the request object.
          Swapnil Ghike made changes -
          Assignee Swapnil Ghike [ swapnilghike ]
          Hide
          Swapnil Ghike added a comment -

          Thanks for your comments. I agree that allowing dot will make our lives easier.

          In the attached patch, I have added dot to the regexes in Topic and Config classes. The regexes have also been modified to escape the last hyphen so that in the future if someone accidentally added a character at the end of the regex group, then the hyphen will not be treated as a range identifier.

          Topic has been modified to disallow "." and ".." as topic names.

          TopicTest and ConfigTest have been modified accordingly.

          After including this patch, we can keep Jun's patch or discard it depending on what is convenient to keep in clientId.

          John, can you please check if this patch solves the problem (after reverting Jun's patch so that the root cause is reintroduced)?

          Show
          Swapnil Ghike added a comment - Thanks for your comments. I agree that allowing dot will make our lives easier. In the attached patch, I have added dot to the regexes in Topic and Config classes. The regexes have also been modified to escape the last hyphen so that in the future if someone accidentally added a character at the end of the regex group, then the hyphen will not be treated as a range identifier. Topic has been modified to disallow "." and ".." as topic names. TopicTest and ConfigTest have been modified accordingly. After including this patch, we can keep Jun's patch or discard it depending on what is convenient to keep in clientId. John, can you please check if this patch solves the problem (after reverting Jun's patch so that the root cause is reintroduced)?
          Swapnil Ghike made changes -
          Attachment kafka-697-allow-dot.patch [ 12564837 ]
          Hide
          Jun Rao added a comment -

          The new patch looks good to me. We can just apply this patch once John verifies the system tests are ok.

          Show
          Jun Rao added a comment - The new patch looks good to me. We can just apply this patch once John verifies the system tests are ok.
          Jun Rao made changes -
          Link This issue duplicates KAFKA-704 [ KAFKA-704 ]
          Hide
          John Fung added a comment -

          Thanks Swapnil for the latest patch.

          The patch is sanity tested in a single box in which the hostname doesn't have a "." and in a distributed env where the host names have a "." (besides the parent domain name). In both cases, the patch works fine.

          Show
          John Fung added a comment - Thanks Swapnil for the latest patch. The patch is sanity tested in a single box in which the hostname doesn't have a "." and in a distributed env where the host names have a "." (besides the parent domain name). In both cases, the patch works fine.
          Hide
          Neha Narkhede added a comment -

          +1 on the new patch, thanks Swapnil!

          Show
          Neha Narkhede added a comment - +1 on the new patch, thanks Swapnil!
          Hide
          Neha Narkhede added a comment -

          Committed patch to 0.8 branch

          Show
          Neha Narkhede added a comment - Committed patch to 0.8 branch
          Neha Narkhede made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Neha Narkhede made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Hide
          Jun Rao added a comment -

          Thanks for the latest patch. Committed to 0.8 and merged to trunk.

          Show
          Jun Rao added a comment - Thanks for the latest patch. Committed to 0.8 and merged to trunk.

            People

            • Assignee:
              Swapnil Ghike
              Reporter:
              John Fung
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development