Kafka / KAFKA-718

kafka-run-class.sh should use reasonable gc settings

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None

      Description

      Our start script seems to use the default "stop the world" collector. It would be good to default to well-tuned GC settings including GC logging, CMS, etc. Whatever we are using in prod and perf lab...

      Many people who want to use Kafka basically don't know Java well, so they won't succeed in figuring this stuff out on their own and will just think it is broken and timing out if we don't have good defaults.

      1. 718-v1.patch
        4 kB
        Ashwanth Fernando
      2. KAFKA-718-v2.patch
        9 kB
        Jay Kreps
      3. KAFKA-718-v3.patch
        12 kB
        Jay Kreps
      4. KAFKA-718-v4.patch
        20 kB
        Jay Kreps
      5. KAFKA-718-v5.patch
        24 kB
        Jay Kreps

        Activity

        Jay Kreps created issue -
        Neha Narkhede made changes -
        Field Original Value New Value
        Assignee Neha Narkhede [ nehanarkhede ]
        Ashwanth Fernando added a comment -

        Hi, this bug seems to have been open for a long time. I can take it up if it's not being paid attention.

        Jay Kreps added a comment -

        That would be great.

        Ashwanth Fernando added a comment -

        I checked out 0.8, committed my changes, and while rebasing with trunk (which I believe holds 0.7), there are a whole lot of inbound patches which have conflicts with the 0.8 code. Do I try to manually merge these conflicts? I have no clue what those patches are.

        Followed the simple contributor workflow here - https://cwiki.apache.org/confluence/display/KAFKA/Git+Workflow except that for "git checkout -b xyz remotes/origin/trunk" I did git checkout -b 718 remotes/origin/0.8

        Jun Rao added a comment -

        Since this is not critical for 0.8, I suggest that we make the patch for trunk.

        Ashwanth Fernando added a comment -

        ok. I will do that and get back. Tx.

        Jay Kreps added a comment -

        Hey Jun, this should be a trivial change and will likely catch a lot of people (most people don't check the GC settings). GC leads to all kinds of irritating problems, so I think it would be good to get this into 0.8 if possible.

        Ashwanth Fernando added a comment -

        The change is trivial. Where I am having problems sending out a patch is that I checked out 0.8, made my changes, and committed to my local git repo. While rebasing with trunk, there are a whole lot of inbound patches that need to be added to the 0.8 branch. Do I merge these patches? I don't feel comfortable dealing with the conflicts, because those patches are not authored by me.

        Or, do I rebase with 0.8 instead of trunk, if I am working on 0.8?

        Ashwanth Fernando added a comment - edited

        Either way, I have a patch that cleanly applies on trunk.

        Basically, kafka-run-class.sh has a lot of .sh clients which use it. A majority of these clients are non-daemon and a couple (zk and kafkaServer) are daemons. For the non-daemon ones, we don't need GC logs, so I have the following settings:

        -XX:+AggressiveOpts -XX:+UseCompressedOops -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC

        For the daemon processes (zk and kafkaServer) I have the above options set, plus the options to emit GC logs:

        -Xloggc:$base_dir/$GC_LOG_FILE_NAME -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

        where GC_LOG_FILE_NAME is either zookeeper_gc.log or kafkaServer_gc.log, depending on which daemon is run. This is done so that when developers run both zk and kafkaServer on the same filesystem, the two processes don't clobber the same file.
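        The daemon/non-daemon split described in this comment can be sketched roughly as follows. This is an illustrative sketch, not the actual kafka-run-class.sh: the function name build_jvm_opts is hypothetical, and -XX:+AggressiveOpts is omitted since a later patch revision dropped it for Java 1.6 compatibility.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the JVM-option logic described in the comment above;
# build_jvm_opts and KAFKA_GC_OPTS are assumed names, not the script's own.

base_dir="$(pwd)"

# CMS collector settings shared by all commands, as listed in the comment
# (minus -XX:+AggressiveOpts, which a later revision removed).
KAFKA_GC_OPTS="-XX:+UseCompressedOops -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
-XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC"

build_jvm_opts() {
  # $1 = "daemon" or anything else; $2 = GC log file name (daemons only)
  local mode="$1" gc_log_file="$2"
  local opts="$KAFKA_GC_OPTS"
  if [ "$mode" = "daemon" ]; then
    # Daemons (zookeeper, kafkaServer) additionally log GC activity, each to
    # its own file so concurrent daemons don't clobber the same log.
    opts="$opts -Xloggc:$base_dir/$gc_log_file -verbose:gc -XX:+PrintGCDetails \
-XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps"
  fi
  printf '%s\n' "$opts"
}

build_jvm_opts daemon kafkaServer_gc.log
```

        A non-daemon caller would invoke the same helper without the daemon flag and get only the collector settings, with no -Xloggc output.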

        Ashwanth Fernando made changes -
        Attachment 718-v1.patch [ 12591330 ]
        Ashwanth Fernando made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.8 [ 12317244 ]
        Jay Kreps added a comment -

        Hey Ashwanth, thanks for the patch.

        I applied it to 0.8 and I want to abuse this ticket to include a bunch of trivial "good housekeeping" fixes I think we need for the 0.8 final release that probably don't warrant a ticket of their own.

        This includes:
        1. Your GC settings, but minus +AggressiveOpts, which doesn't seem to work on all versions of Java 1.6
        2. Clean up the README which has a number of outdated things in it.
        3. Remove unused properties from server.properties to fix WARN messages.
        4. Change to use log.dirs in the example server config so people know about multiple log directories
        5. Remove incorrect description of flush.interval in example server config.

        Jay Kreps made changes -
        Attachment KAFKA-718-v2.patch [ 12591911 ]
        Jay Kreps added a comment -

        Separate out a few more command line variables (GC, GC logging, SCALA_VERSION, etc).

        Also remove delete_topic.sh command since that doesn't work yet.

        Jay Kreps made changes -
        Attachment KAFKA-718-v3.patch [ 12591933 ]
        Ashwanth Fernando added a comment - edited

        Thanks Jay! Can we mark this as resolved, or do you have some more?

        Jay Kreps added a comment -

        Since 0.8 is somewhat locked down I am just waiting on close sanity checking from a few others before I check in. I'll close it up then.

        Jay Kreps added a comment -

        New patch. Rebased. Fixed logging in tools so that we no longer spew crazy log messages intermingled with the tool output.

        Jay Kreps made changes -
        Attachment KAFKA-718-v4.patch [ 12595759 ]
        Jay Kreps added a comment -

        Patch v5: move all logs to a logs/ directory to avoid polluting the main directory with a half dozen rolling logs.

        Jay Kreps made changes -
        Attachment KAFKA-718-v5.patch [ 12595831 ]
        Jun Rao added a comment -

        Thanks for patch v4. Got the following exception running kafka-console-producer.sh (ditto for kafka-console-consumer.sh):

        bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
        log4j:ERROR Could not read configuration file from URL file:bin/../config/tools-log4j.properties.
        java.io.FileNotFoundException: bin/../config/tools-log4j.properties (No such file or directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at java.io.FileInputStream.<init>(FileInputStream.java:79)
        at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
        at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
        at java.net.URL.openStream(URL.java:1010)
        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:459)
        at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:471)
        at org.apache.log4j.LogManager.<clinit>(LogManager.java:125)
        at org.apache.log4j.Logger.getLogger(Logger.java:105)
        at kafka.utils.Logging$class.logger(Logging.scala:24)
        at kafka.utils.VerifiableProperties.logger(VerifiableProperties.scala:23)
        at kafka.utils.Logging$class.info(Logging.scala:66)
        at kafka.utils.VerifiableProperties.info(VerifiableProperties.scala:23)
        at kafka.utils.VerifiableProperties.verify(VerifiableProperties.scala:180)
        at kafka.producer.ProducerConfig.<init>(ProducerConfig.scala:57)
        at kafka.producer.ConsoleProducer$.main(ConsoleProducer.scala:147)
        at kafka.producer.ConsoleProducer.main(ConsoleProducer.scala)
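        The FileNotFoundException above comes from log4j resolving the relative path bin/../config/tools-log4j.properties against the caller's working directory rather than the script's own location. A minimal sketch of one way a launcher script can avoid that failure mode; the variable names (script_dir, KAFKA_LOG4J_OPTS) are illustrative here, not quoted from the v5 patch:

```shell
#!/usr/bin/env bash
# Illustrative sketch: compute the log4j config path relative to the script
# itself, so the tool works no matter which directory it is launched from.
# script_dir and KAFKA_LOG4J_OPTS are assumed names, not from the patch.

script_dir="$(cd "$(dirname "$0")" && pwd)"   # absolute path of the bin/ dir
log4j_config="$script_dir/../config/tools-log4j.properties"

KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$log4j_config"
printf '%s\n' "$KAFKA_LOG4J_OPTS"
```

        Because $script_dir is always absolute, the resulting file: URL no longer depends on where the user happens to run the command from.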

        Jay Kreps added a comment -

        Sorry, I had missed that file in v4. I had fixed it in v5, can you try that?

        Jun Rao added a comment -

        +1 on v5.

        Jay Kreps added a comment -

        committed

        Jay Kreps made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Assignee Neha Narkhede [ nehanarkhede ] Jay Kreps [ jkreps ]
        Fix Version/s 0.8 [ 12317244 ]
        Resolution Fixed [ 1 ]

          People

          • Assignee: Jay Kreps
          • Reporter: Jay Kreps
          • Votes: 0
          • Watchers: 3