Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: core
    • Labels:

      Description

      Currently we create one mbean of each type for a given mbean server, regardless of the number of clients. We should create MBeans per client for both producer and consumer. To do that we need to introduce clientId in mbean names.

      1. kafka-622-v6.patch
        95 kB
        Swapnil Ghike
      2. kafka-622-v5.patch
        95 kB
        Swapnil Ghike
      3. kafka-622-v4.patch
        95 kB
        Swapnil Ghike
      4. kafka-622-v3.patch
        95 kB
        Swapnil Ghike
      5. kafka-622-v2.patch
        85 kB
        Swapnil Ghike
      6. kafka-622-v1.patch
        60 kB
        Swapnil Ghike

        Activity

        Hide
        Swapnil Ghike added a comment -

        The major change in this patch is to convert a bunch of *Stat objects to classes, and instantiating one instance of each such class at the high level Producer, ZookeeperConsumerConnector and AbstractFetcherThread level. This will ensure that clients within the same service will have their own mbeans.

        The mbean name will contain clientId, and the clientId is validated at the high level Producer, ZookeeperConsumerConnector and AbstractFetcherThread level.

        I have set max length of clientId to 200, because it will be appended with some string while creating mbean, and the mbean name might be limited by filename length.

        Other changes are a bunch of import optimizations, and a test case for clientId validation.

        Show
        Swapnil Ghike added a comment - The major change in this patch is to convert a bunch of *Stat objects to classes, and instantiating one instance of each such class at the high level Producer, ZookeeperConsumerConnector and AbstractFetcherThread level. This will ensure that clients within the same service will have their own mbeans. The mbean name will contain clientId, and the clientId is validated at the high level Producer, ZookeeperConsumerConnector and AbstractFetcherThread level. I have set max length of clientId to 200, because it will be appended with some string while creating mbean, and the mbean name might be limited by filename length. Other changes are a bunch of import optimizations, and a test case for clientId validation.
        Hide
        Neha Narkhede added a comment -

        1. AbstractFetcherThread
        1.1 Maybe the clientid validation could happen once on process startup (while instantiating the config) instead of on every fetcher thread. In other words, would it make sense to let the broker startup
        if the config doesn't comply with the allowed values ?
        1.2 How about renaming FetcherLagStat to FetcherLagStats, same for FetcherStat and ConsumerTopicStat and any APIs that are get*Stat.
        1.3 The change to FetcherLagStat.getFetcherLagMetrics serves the purpose but seems slightly hacky. Instead of changing the name of the topic by prepending the clientid to it, what do you think about ch
        anging FetcherLagMetrics take in clientId, (topic, partition) ? The valueFactory can be changed to pass in clientid to the constructor of FetcherLagMetrics.
        1.4 In other parts of code, we've been trying to move away from using a tuple of topic, partition as the key to a map and using TopicPartition instead. The reason being its unsafe since scala doesn't do a good job of doing the right thing on map/fold operations on such a map. Wondering if you were up for making that change in this patch ? If not, then lets file another JIRA to fix that.

        2. ConsumerTopicStat.scala
        2.1 Change the name to ConsumerTopicStats
        2.2 Why does clientId default to empty string ?
        2.3 Same change as 1.3 to ConsumerTopicMetrics
        2.4 I wonder if it makes sense to allow creating ConsumerIterator without passing in ConsumerTopicStat ? ConsumerTopicStat is instantiated in ZookeeperConsumerConnector on startup. The same object need
        s to be passed around everywhere else in the consumer. However, the code is allowing ConsumerTopicStat to be re-instantiated in ConsumerIterator with an empty clientId. Isn't that going to create new m
        etrics objects tracking different values for the same class of metrics ? I guess the right thing is to not provide it a default

        3. DefaultEventHandler
        3.1 Same as 2.4 here for ProducerStats and ProducerTopicStat. 3.2 Let's standardize the naming here. One is ProducerStats and the other is ProducerTopicStat. How about renaming ProducerTopicStat to ProducerTopicStats ?

        4. PartitionTopicInfo
        Same as 2.4 here

        5. Producer
        5.1 What is the point of passing in empty string to ProducerTopicStat ? If that is testing only, how about passing in a clientid like "test-producer". I also think it is better to not change the constr
        uctutor in this way and instead have the test pass in the producerStats object.
        5.2 The clientId is validated after it is passed in to ProducerSendThread. Let's move the clientId validation all the way up in the constructor.
        5.3 Rename getProducer*Stat to getProducer*Stats
        5.4 Same as 1.3 here as well
        5.5 Having a separate AsyncProducerStats object seems a little clunky and is probably a remnant from the time when we explicitly exposed AsyncProducer and SyncProducer as public APIs. I'm guessing movi
        ng this to ProducerStats would be better. That way you have only 2 metrics to worry about on the producer, one is ProducerStats and the other is per-topic stats exposed through ProducerTopicStats

        6. ProducerPool
        6.1 Same as 5.1 here, let's not change the constructor of ProducerPool in this manner, its ugly

        7. ProducerSendThread
        7.1 Let's not allow empty clientId as the default value. It should always be passed in correctly.

        8. SimpleConsumer
        8.1 I wonder what is the value of letting one instantiate a FetchRequestAndResponseStat with an empty clientId ?
        8.2 Instead of adding a stat object to the constructor of SimpleConsumer, let's just add the clientId. It seems unnatural to have the user of SimpleConsumer to know how to instantiate the stat object.
        On the other hand, the purpose of clientId might be much easier to explain and understand.
        8.3 Let's not change the static SimpleConsumer APIs to have the user pass in a stat object. These are fire and forget queries. At best, let's have the user pass in clientId or default it to be the host
        name or something meaningful but simple. ZookeeperConsumerConnector will pass in the right value for clientId

        9. SyncProducer
        9.1 Same as 8.2. Let's not change the constructor in this manner
        9.2 Rename ProducerRequestStat to ProducerRequestStats

        10. Is TopicTest deleted on purpose ?

        Show
        Neha Narkhede added a comment - 1. AbstractFetcherThread 1.1 Maybe the clientid validation could happen once on process startup (while instantiating the config) instead of on every fetcher thread. In other words, would it make sense to let the broker startup if the config doesn't comply with the allowed values ? 1.2 How about renaming FetcherLagStat to FetcherLagStats, same for FetcherStat and ConsumerTopicStat and any APIs that are get*Stat. 1.3 The change to FetcherLagStat.getFetcherLagMetrics serves the purpose but seems slightly hacky. Instead of changing the name of the topic by prepending the clientid to it, what do you think about ch anging FetcherLagMetrics take in clientId, (topic, partition) ? The valueFactory can be changed to pass in clientid to the constructor of FetcherLagMetrics. 1.4 In other parts of code, we've been trying to move away from using a tuple of topic, partition as the key to a map and using TopicPartition instead. The reason being its unsafe since scala doesn't do a good job of doing the right thing on map/fold operations on such a map. Wondering if you were up for making that change in this patch ? If not, then lets file another JIRA to fix that. 2. ConsumerTopicStat.scala 2.1 Change the name to ConsumerTopicStats 2.2 Why does clientId default to empty string ? 2.3 Same change as 1.3 to ConsumerTopicMetrics 2.4 I wonder if it makes sense to allow creating ConsumerIterator without passing in ConsumerTopicStat ? ConsumerTopicStat is instantiated in ZookeeperConsumerConnector on startup. The same object need s to be passed around everywhere else in the consumer. However, the code is allowing ConsumerTopicStat to be re-instantiated in ConsumerIterator with an empty clientId. Isn't that going to create new m etrics objects tracking different values for the same class of metrics ? I guess the right thing is to not provide it a default 3. DefaultEventHandler 3.1 Same as 2.4 here for ProducerStats and ProducerTopicStat. 3.2 Let's standardize the naming here. One is ProducerStats and the other is ProducerTopicStat. How about renaming ProducerTopicStat to ProducerTopicStats ? 4. PartitionTopicInfo Same as 2.4 here 5. Producer 5.1 What is the point of passing in empty string to ProducerTopicStat ? If that is testing only, how about passing in a clientid like "test-producer". I also think it is better to not change the constr uctutor in this way and instead have the test pass in the producerStats object. 5.2 The clientId is validated after it is passed in to ProducerSendThread. Let's move the clientId validation all the way up in the constructor. 5.3 Rename getProducer*Stat to getProducer*Stats 5.4 Same as 1.3 here as well 5.5 Having a separate AsyncProducerStats object seems a little clunky and is probably a remnant from the time when we explicitly exposed AsyncProducer and SyncProducer as public APIs. I'm guessing movi ng this to ProducerStats would be better. That way you have only 2 metrics to worry about on the producer, one is ProducerStats and the other is per-topic stats exposed through ProducerTopicStats 6. ProducerPool 6.1 Same as 5.1 here, let's not change the constructor of ProducerPool in this manner, its ugly 7. ProducerSendThread 7.1 Let's not allow empty clientId as the default value. It should always be passed in correctly. 8. SimpleConsumer 8.1 I wonder what is the value of letting one instantiate a FetchRequestAndResponseStat with an empty clientId ? 8.2 Instead of adding a stat object to the constructor of SimpleConsumer, let's just add the clientId. It seems unnatural to have the user of SimpleConsumer to know how to instantiate the stat object. On the other hand, the purpose of clientId might be much easier to explain and understand. 8.3 Let's not change the static SimpleConsumer APIs to have the user pass in a stat object. These are fire and forget queries. At best, let's have the user pass in clientId or default it to be the host name or something meaningful but simple. ZookeeperConsumerConnector will pass in the right value for clientId 9. SyncProducer 9.1 Same as 8.2. Let's not change the constructor in this manner 9.2 Rename ProducerRequestStat to ProducerRequestStats 10. Is TopicTest deleted on purpose ?
        Hide
        Swapnil Ghike added a comment -

        1.2, 2.1, 3.2, 5.3, 9.2: Renamed *Stat to *Stats and get*Stat to get*Stats. Looks better.

        1.3, 2.3, 5.4: Yes, I agree. Changed.

        2.2, 2.4, 3.1, 4, 5.1, 6.1, 7.1, 8.1: I was not sure if I should modify a whole bunch of unit tests or provide defaults in constructors. The v2 patch touches a whole bunch of unit tests and some tools.

        1.1 Yes, clientId is validated in ZookeeperConsumerConnector which is eventually passed to ConsumerFetcherThread (which extends AbstractFetcherThread). We control the clientId for ReplicaManagerThread (which also extends AbstractFetcherThread) in code, so I guess we can safely remove clientId validation from AbstractFetcherThread.
        1.4 I can file another Jira to take a look at this.

        5.2 Yes, my bad. To correct this, I had to hack around the Producer constructor. The same *Stats objects need to be passed to DefaultEventHandler and Producer, and they can be constructed only after validating the clientId. Scala does not allow you to put any statement (ClientId.validate(clientId) in our case) before calling a (n-1) level constructor, so you need to put the clientId validation in a block and return a quadruplet from the block which is evaluated as the constructor argument. This constructor acts as an intermediary to separate out the elements of the quadruplet. Suggestions are welcome.
        5.5 Moved DroppedMessagePerSec to ProducerStats and deleted AsyncProducerStats.

        10 That was because I was trying to modify and rename a file at the same time, and svn has issues cleanly applying such patch. On a second thought, separated TopicTest and ClientIdTest. This patch contains deletion of ConsumerTopicStat.scala and addition of ConsumerTopicStats.scala for the same reason.

        Change of plan:
        8.2, 8.3, 9.1 Now, each SyncProducer and SimpleConsumer will register its own ProducerRequestStats/ConsumerRequestStats. That removes the need to pass a *Stats object via their constructors. Aggregation of ProducerRequestStats/ConsumerRequestStats at the high level Producer or ZookeeperConsumerConnector level will probably require an aggregator object to be passed to the constructors of SyncProducer/SimpleConsumer. So, I have left that part out.

        Additional things:
        a. Not quite sure what should go as the clientId in KafkaETLContext.
        b. Overloaded the static createSyncProducer API. The two separate APIs remove the need to use Option and also make it easy to pass "FetchMetadata" as the clientId from ClientUtils.
        c. Renamed kafka.controller.ControllerStat to ControllerStats and kafka.server.BrokerTopicStat to BrokerTopicStats.

        Show
        Swapnil Ghike added a comment - 1.2, 2.1, 3.2, 5.3, 9.2: Renamed *Stat to *Stats and get*Stat to get*Stats. Looks better. 1.3, 2.3, 5.4: Yes, I agree. Changed. 2.2, 2.4, 3.1, 4, 5.1, 6.1, 7.1, 8.1: I was not sure if I should modify a whole bunch of unit tests or provide defaults in constructors. The v2 patch touches a whole bunch of unit tests and some tools. 1.1 Yes, clientId is validated in ZookeeperConsumerConnector which is eventually passed to ConsumerFetcherThread (which extends AbstractFetcherThread). We control the clientId for ReplicaManagerThread (which also extends AbstractFetcherThread) in code, so I guess we can safely remove clientId validation from AbstractFetcherThread. 1.4 I can file another Jira to take a look at this. 5.2 Yes, my bad. To correct this, I had to hack around the Producer constructor. The same *Stats objects need to be passed to DefaultEventHandler and Producer, and they can be constructed only after validating the clientId. Scala does not allow you to put any statement (ClientId.validate(clientId) in our case) before calling a (n-1) level constructor, so you need to put the clientId validation in a block and return a quadruplet from the block which is evaluated as the constructor argument. This constructor acts as an intermediary to separate out the elements of the quadruplet. Suggestions are welcome. 5.5 Moved DroppedMessagePerSec to ProducerStats and deleted AsyncProducerStats. 10 That was because I was trying to modify and rename a file at the same time, and svn has issues cleanly applying such patch. On a second thought, separated TopicTest and ClientIdTest. This patch contains deletion of ConsumerTopicStat.scala and addition of ConsumerTopicStats.scala for the same reason. Change of plan: 8.2, 8.3, 9.1 Now, each SyncProducer and SimpleConsumer will register its own ProducerRequestStats/ConsumerRequestStats. That removes the need to pass a *Stats object via their constructors. Aggregation of ProducerRequestStats/ConsumerRequestStats at the high level Producer or ZookeeperConsumerConnector level will probably require an aggregator object to be passed to the constructors of SyncProducer/SimpleConsumer. So, I have left that part out. Additional things: a. Not quite sure what should go as the clientId in KafkaETLContext. b. Overloaded the static createSyncProducer API. The two separate APIs remove the need to use Option and also make it easy to pass "FetchMetadata" as the clientId from ClientUtils. c. Renamed kafka.controller.ControllerStat to ControllerStats and kafka.server.BrokerTopicStat to BrokerTopicStats.
        Hide
        Neha Narkhede added a comment -

        Thanks for incorporating suggestions in the first review!

        1. AbstractFetcherThread.scala

        FetcherLagMetrics now takes in a nested tuples. Scala's support for tuples makes it easier to express short lived data structures that need to be paired conveniently, but they tend to reduce code readability. In addition to this, we've been bitten by incomplete Scala semantics when comparing two tuples or using them for some of the functional programming collections APIs. Over time, as we understood this more, we now encourage use of tuples only for expressing short lived objects within maybe the same API. We haven't changed all the past usages to conform to this yet, but we do hope we can avoid this in the future usage. Having said that, I think here it is a good idea to change this to accept two arguments - String and TopicAndPartition.

        2. ClientUtils

        Both producers and consumers use this API to fetch metadata and it seems like we have hardcoded the clientId. The clientId to create sync producer should be the one that the consumer/producer is instantiated with.

        3. Producer.scala

        3.1 I think you are validating the clientId twice in the constructor that takes in just the ProducerConfig.
        3.2 Maybe this was done for convenience, what is the reason that the constructor argument to ProducerTopicMetrics is a tuple vs 2 separate strings ?
        3.3 We definitely want to avoid this -
        private val stats = new Pool[(String, String), ProducerTopicMetrics](Some(valueFactory))
        Change this to use a case class TopicAndClientId here as well as ConsumerTopicStats.scala.

        4. ProducerPool.scala

        createSyncProducer is overloaded to create 2 very similar but confusing methods. One takes Broker as the second argument, the other takes it as the first argument. I see why you need two, but it will be great to add some docs to explain where each of those is used.

        5. ProducerSendThread.scala

        There needs to be a delimiter ("-") between the clientId and ProducerQueueSize

        6. ProducerConfig

        Not sure which patch introduced this property, but it is inconsistent with the property naming conventions. It happens to be the only one that uses an underscore in the property name. Besides, the one in the consumer is called just clientid. Let's please change this to clientid.

        val clientId = props.getString("producer.request.client_id",SyncProducerConfig.DefaultClientId)

        7. ZookeeperConsumerConnector

        It will be good to have some summary stats here that aggregates from individual simple consumers. Often users wouldn't want to do this math themselves.

        8. What happened to ConsumerTopicStats ?

        Show
        Neha Narkhede added a comment - Thanks for incorporating suggestions in the first review! 1. AbstractFetcherThread.scala FetcherLagMetrics now takes in a nested tuples. Scala's support for tuples makes it easier to express short lived data structures that need to be paired conveniently, but they tend to reduce code readability. In addition to this, we've been bitten by incomplete Scala semantics when comparing two tuples or using them for some of the functional programming collections APIs. Over time, as we understood this more, we now encourage use of tuples only for expressing short lived objects within maybe the same API. We haven't changed all the past usages to conform to this yet, but we do hope we can avoid this in the future usage. Having said that, I think here it is a good idea to change this to accept two arguments - String and TopicAndPartition. 2. ClientUtils Both producers and consumers use this API to fetch metadata and it seems like we have hardcoded the clientId. The clientId to create sync producer should be the one that the consumer/producer is instantiated with. 3. Producer.scala 3.1 I think you are validating the clientId twice in the constructor that takes in just the ProducerConfig. 3.2 Maybe this was done for convenience, what is the reason that the constructor argument to ProducerTopicMetrics is a tuple vs 2 separate strings ? 3.3 We definitely want to avoid this - private val stats = new Pool [(String, String), ProducerTopicMetrics] (Some(valueFactory)) Change this to use a case class TopicAndClientId here as well as ConsumerTopicStats.scala. 4. ProducerPool.scala createSyncProducer is overloaded to create 2 very similar but confusing methods. One takes Broker as the second argument, the other takes it as the first argument. I see why you need two, but it will be great to add some docs to explain where each of those is used. 5. ProducerSendThread.scala There needs to be a delimiter ("-") between the clientId and ProducerQueueSize 6. ProducerConfig Not sure which patch introduced this property, but it is inconsistent with the property naming conventions. It happens to be the only one that uses an underscore in the property name. Besides, the one in the consumer is called just clientid. Let's please change this to clientid. val clientId = props.getString("producer.request.client_id",SyncProducerConfig.DefaultClientId) 7. ZookeeperConsumerConnector It will be good to have some summary stats here that aggregates from individual simple consumers. Often users wouldn't want to do this math themselves. 8. What happened to ConsumerTopicStats ?
        Hide
        Swapnil Ghike added a comment -

        1. I see your point, that's good to know. Now using a case class with three constructor arguments. Using two separate arguments String and TopicAndPartition would mean that the key of the stats Pool will be a touple.
        2. Yup.
        3.1 Yes, the primary constructor is only used for unit testing. So we can skip clientId validation.
        3.2, 3.3 using a case class object to avoid using a tuple as the constructor argument.
        4. Added a comment above both methods, also changed the order of arguments so that broker is the second argument in both methods.
        5. Aye.
        6. Changed it to clientId = props.getString("clientid",SyncProducerConfig.DefaultClientId)
        7. Aggregation of ConsumerRequestStats at the ZookeeperConsumerConnector level will need us to pass an aggregator object to the constructor of SimpleConsumer. Then we will either need to provide this aggregator object a default value in the constructor because we can't expect the users of SimpleConsumer to know how to instantiate that object. Thoughts?
        8. ConsumerTopicStat.scala was modified and renamed to ConsumerTopicStats.scala. Due to svn issues, deleted the old file, added the modified file as a new file.

        Show
        Swapnil Ghike added a comment - 1. I see your point, that's good to know. Now using a case class with three constructor arguments. Using two separate arguments String and TopicAndPartition would mean that the key of the stats Pool will be a touple. 2. Yup. 3.1 Yes, the primary constructor is only used for unit testing. So we can skip clientId validation. 3.2, 3.3 using a case class object to avoid using a tuple as the constructor argument. 4. Added a comment above both methods, also changed the order of arguments so that broker is the second argument in both methods. 5. Aye. 6. Changed it to clientId = props.getString("clientid",SyncProducerConfig.DefaultClientId) 7. Aggregation of ConsumerRequestStats at the ZookeeperConsumerConnector level will need us to pass an aggregator object to the constructor of SimpleConsumer. Then we will either need to provide this aggregator object a default value in the constructor because we can't expect the users of SimpleConsumer to know how to instantiate that object. Thoughts? 8. ConsumerTopicStat.scala was modified and renamed to ConsumerTopicStats.scala. Due to svn issues, deleted the old file, added the modified file as a new file.
        Hide
        Neha Narkhede added a comment -

        v3 looks much better and almost ready for check in. Few minor cleanup comments -

        9. Minor naming convention suggestion - Probably better to rename ClientIdTopicPartition to ClientIdAndPartition or ClientIdAndTopicPartition
        10. ProducerPool Minor documentation nitpick - The API docs for createSyncProducer doesn't match the commenting style for documenting APIs. We use C style comments while this patch has added C++ style comments.
        11. SimpleConsumerPerformance - Remove unused import FetchRequestAndResponseStats
        12. BrokerPartitionInfo - Not introduced by your patch, but will be good to fix. Typo in the name of the param. Change it to topics instead of topic
        13.1 Producer - Again not introduced by your patch, just happened to notice it right now. Typo in the name of the param in send API docs. It says producerData now but should be messages
        13.2 Minor nitpick - Rest of the metrics classes have name as their constructor argument, except ProducerTopicMetrics which says tuple and its not a tuple.
        14. SyncProducerConfig - Elsewhere, we don't use camel case for config names. Probably best to change it to clientid like in ConsumerConfig
        15. What happened to Topic.scala ?

        7. For now, it seems that the messages rate, both per topic and global should suffice. We can, however, think about how/if the aggregation metrics for fetch response rate and size makes sense at the ZookeeperConsumerConnector level. But I d
        on't think we should hold up this patch for it. Please can you file a JIRA to track that ?

        Show
        Neha Narkhede added a comment - v3 looks much better and almost ready for check in. Few minor cleanup comments - 9. Minor naming convention suggestion - Probably better to rename ClientIdTopicPartition to ClientIdAndPartition or ClientIdAndTopicPartition 10. ProducerPool Minor documentation nitpick - The API docs for createSyncProducer doesn't match the commenting style for documenting APIs. We use C style comments while this patch has added C++ style comments. 11. SimpleConsumerPerformance - Remove unused import FetchRequestAndResponseStats 12. BrokerPartitionInfo - Not introduced by your patch, but will be good to fix. Typo in the name of the param. Change it to topics instead of topic 13.1 Producer - Again not introduced by your patch, just happened to notice it right now. Typo in the name of the param in send API docs. It says producerData now but should be messages 13.2 Minor nitpick - Rest of the metrics classes have name as their constructor argument, except ProducerTopicMetrics which says tuple and its not a tuple. 14. SyncProducerConfig - Elsewhere, we don't use camel case for config names. Probably best to change it to clientid like in ConsumerConfig 15. What happened to Topic.scala ? 7. For now, it seems that the messages rate, both per topic and global should suffice. We can, however, think about how/if the aggregation metrics for fetch response rate and size makes sense at the ZookeeperConsumerConnector level. But I d on't think we should hold up this patch for it. Please can you file a JIRA to track that ?
        Hide
        Swapnil Ghike added a comment -

        9 - I am not convinced that "ClientIdAndTopicPartition" is the right way to name it, because the class has 3 separate constructor arguments and it does not use a TopicAndPartition constructor argument. Also I did not want to include too many "And"s in the name. Let me know what you think and I can change it accordingly.
        Fixed 10, 11, 12, 13.1.
        13.2 - name was used to refer to the topic name. Replaced it with a more straightforward clientIdTopic of the type ClientIdAndTopic.
        14. A more uniform scheme would be to use clientId = "client.id" in both SyncProducerConfigShared and ConsumerConfig. This way we don't have any confusion in using $word+"id", or $word+"_id", or $word+"Id". Similarly, we could change groupid, consumerid in ConsumerConfig and producer.request.correlation_id in SyncProducerConfigShared. I haven't changed them in this patch because changing groupid might have impact on any tools/tests that are creating consumers. Should we change them in another/this jira?
        15. Grouped object ClientId, object Topic and case class ClientIdAndTopic together in one file ClientIdAndTopic.scala.
        7. Ok.

        Show
        Swapnil Ghike added a comment - 9 - I am not convinced that "ClientIdAndTopicPartition" is the right way to name it, because the class has 3 separate constructor arguments and it does not use a TopicAndPartition constructor argument. Also I did not want to include too many "And"s in the name. Let me know what you think and I can change it accordingly. Fixed 10, 11, 12, 13.1. 13.2 - name was used to refer to the topic name. Replaced it with a more straightforward clientIdTopic of the type ClientIdAndTopic. 14. A more uniform scheme would be to use clientId = "client.id" in both SyncProducerConfigShared and ConsumerConfig. This way we don't have any confusion in using $word+"id", or $word+"_id", or $word+"Id". Similarly, we could change groupid, consumerid in ConsumerConfig and producer.request.correlation_id in SyncProducerConfigShared. I haven't changed them in this patch because changing groupid might have impact on any tools/tests that are creating consumers. Should we change them in another/this jira? 15. Grouped object ClientId, object Topic and case class ClientIdAndTopic together in one file ClientIdAndTopic.scala. 7. Ok.
        Hide
        Swapnil Ghike added a comment -

        Quick fix to interchange the order of constructor arguments of ClientIdAndTopic.

        Show
        Swapnil Ghike added a comment - Quick fix to interchange the order of constructor arguments of ClientIdAndTopic.
        Hide
        Swapnil Ghike added a comment -

        Oops, another quick fix. Didn't realize that syncproducer was not getting clientId while fetching metadata.

        Show
        Swapnil Ghike added a comment - Oops, another quick fix. Didn't realize that syncproducer was not getting clientId while fetching metadata.
        Hide
        Neha Narkhede added a comment -

        9. Sounds good.
        14. That's a good suggestion but bigger change and backwards incompatible as well. Let's think it through more. For now, let's leave clientid as it is.

        +1

        Show
        Neha Narkhede added a comment - 9. Sounds good. 14. That's a good suggestion but bigger change and backwards incompatible as well. Let's think it through more. For now, let's leave clientid as it is. +1
        Hide
        Neha Narkhede added a comment -

        Committed patch v6 after making the following minor changes-
        1. Changed the config name to clientid until we change the naming convention across the board
        2. Removed unused variables from SyncProducer

        Please file the JIRAs to track the metrics aggregation and the config naming convention.

        Show
        Neha Narkhede added a comment - Committed patch v6 after making the following minor changes- 1. Changed the config name to clientid until we change the naming convention across the board 2. Removed unused variables from SyncProducer Please file the JIRAs to track the metrics aggregation and the config naming convention.
        Hide
        Jun Rao added a comment -

        Sorry for the late review. Have a few minor questions/comments.

        60. What happens if have 2 instances of Consumers with the same clientid in the same jvm? Does one of them fail because it fails to register metrics? Ditto for Producers.

        61. ConsumerTopicStats: What if a topic is named AllTopics? We use to handle this by adding a - in topic specific stats.

        62. ZookeeperConsumerConnector: Do we need to validate groupid?

        63. ClientId: Does the clientid length need to be different from topic length?

        64. AbstractFetcherThread: When building a fetch request, do we need to pass in brokerInfo as part of the client id? BrokerInfo contains the source broker info and the fetch requests are always made to the source broker.

        Show
        Jun Rao added a comment - Sorry for the late review. Have a few minor questions/comments. 60. What happens if have 2 instances of Consumers with the same clientid in the same jvm? Does one of them fail because it fails to register metrics? Ditto for Producers. 61. ConsumerTopicStats: What if a topic is named AllTopics? We use to handle this by adding a - in topic specific stats. 62. ZookeeperConsumerConnector: Do we need to validate groupid? 63. ClientId: Does the clientid length need to be different from topic length? 64. AbstractFetcherThread: When building a fetch request, do we need to pass in brokerInfo as part of the client id? BrokerInfo contains the source broker info and the fetch requests are always made to the source broker.
        Hide
        Swapnil Ghike added a comment -

        Created KAFKA-646 to address your comments.

        Show
        Swapnil Ghike added a comment - Created KAFKA-646 to address your comments.

          People

          • Assignee:
            Swapnil Ghike
            Reporter:
            Swapnil Ghike
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development