Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: None
    • Component/s: producer
    • Labels: None

      Description

      KAFKA-1281 (yet to be checked in) converted the existing tools to use the new producer. I found that TestLogCleaning hangs while sending messages with the new producer. The steps to reproduce the issue and a thread dump follow.

      nnarkhed-mn1:kafka-git-idea nnarkhed$ ./bin/kafka-run-class.sh kafka.TestLogCleaning --broker localhost:9092 --topics 1 --zk localhost:2181 --messages 100000
      Producing 100000 messages...
      Logging produce requests to /var/folders/61/bspy8z8n1t5dn5sdqzsnhbdr000383/T/kafka-log-cleaner-produced-3744326506335955516.txt
      2014-03-04 10:51:35
      Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.65-b04-462 mixed mode):

      "kafka-network-thread" daemon prio=5 tid=7fc27e94c000 nid=0x10a643000 runnable [10a642000]
      java.lang.Thread.State: RUNNABLE
      at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
      at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:136)
      at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:69)
      at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
      - locked <7ec0b0170> (a sun.nio.ch.Util$2)
      - locked <7ec0b0180> (a java.util.Collections$UnmodifiableSet)
      - locked <7ec0b0128> (a sun.nio.ch.KQueueSelectorImpl)
      at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
      at org.apache.kafka.common.network.Selector.select(Selector.java:296)
      at org.apache.kafka.common.network.Selector.poll(Selector.java:198)
      at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:153)
      at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:98)
      at java.lang.Thread.run(Thread.java:695)

      "RMI TCP Accept-0" daemon prio=5 tid=7fc27e99c800 nid=0x10a43d000 runnable [10a43c000]
      java.lang.Thread.State: RUNNABLE
      at java.net.PlainSocketImpl.socketAccept(Native Method)
      at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:439)
      - locked <7ec0b6088> (a java.net.SocksSocketImpl)
      at java.net.ServerSocket.implAccept(ServerSocket.java:468)
      at java.net.ServerSocket.accept(ServerSocket.java:436)
      at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34)
      at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
      at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
      at java.lang.Thread.run(Thread.java:695)

      "AWT-AppKit" daemon prio=5 tid=7fc27f830000 nid=0x7fff7b984180 runnable [00000000]
      java.lang.Thread.State: RUNNABLE

      "Low Memory Detector" daemon prio=5 tid=7fc27e8db000 nid=0x109b30000 runnable [00000000]
      java.lang.Thread.State: RUNNABLE

      "C2 CompilerThread1" daemon prio=9 tid=7fc27e8da800 nid=0x109a2d000 waiting on condition [00000000]
      java.lang.Thread.State: RUNNABLE

      "C2 CompilerThread0" daemon prio=9 tid=7fc27e8d9800 nid=0x10992a000 waiting on condition [00000000]
      java.lang.Thread.State: RUNNABLE

      "Signal Dispatcher" daemon prio=9 tid=7fc27e8d9000 nid=0x109827000 waiting on condition [00000000]
      java.lang.Thread.State: RUNNABLE

      "Surrogate Locker Thread (Concurrent GC)" daemon prio=5 tid=7fc27e002000 nid=0x109724000 waiting on condition [00000000]
      java.lang.Thread.State: RUNNABLE

      "Finalizer" daemon prio=8 tid=7fc27e8d8000 nid=0x109519000 in Object.wait() [109518000]
      java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <7ec0b23b0> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
      - locked <7ec0b23b0> (a java.lang.ref.ReferenceQueue$Lock)
      at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
      at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:171)

      "Reference Handler" daemon prio=10 tid=7fc27e8d7800 nid=0x109416000 in Object.wait() [109415000]
      java.lang.Thread.State: WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <7ec0b4000> (a java.lang.ref.Reference$Lock)
      at java.lang.Object.wait(Object.java:485)
      at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
      - locked <7ec0b4000> (a java.lang.ref.Reference$Lock)

      "main" prio=5 tid=7fc27e000800 nid=0x102159000 in Object.wait() [102158000]
      java.lang.Thread.State: TIMED_WAITING (on object monitor)
      at java.lang.Object.wait(Native Method)
      - waiting on <7ec0b23e0> (a org.apache.kafka.clients.producer.internals.Metadata)
      at org.apache.kafka.clients.producer.internals.Metadata.fetch(Metadata.java:89)
      - locked <7ec0b23e0> (a org.apache.kafka.clients.producer.internals.Metadata)
      at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:210)
      at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:144)
      at kafka.TestLogCleaning$$anonfun$produceMessages$1.apply$mcVL$sp(TestLogCleaning.scala:260)
      at kafka.TestLogCleaning$$anonfun$produceMessages$1.apply(TestLogCleaning.scala:251)
      at kafka.TestLogCleaning$$anonfun$produceMessages$1.apply(TestLogCleaning.scala:251)
      at scala.collection.immutable.NumericRange.foreach(NumericRange.scala:86)
      at kafka.TestLogCleaning$.produceMessages(TestLogCleaning.scala:251)
      at kafka.TestLogCleaning$.main(TestLogCleaning.scala:113)
      at kafka.TestLogCleaning.main(TestLogCleaning.scala)

      "VM Thread" prio=9 tid=7fc27e8d3000 nid=0x109313000 runnable

      "Gang worker#0 (Parallel GC Threads)" prio=9 tid=7fc27e800000 nid=0x10555b000 runnable

      "Gang worker#1 (Parallel GC Threads)" prio=9 tid=7fc27e801000 nid=0x10565e000 runnable

      "Gang worker#2 (Parallel GC Threads)" prio=9 tid=7fc27e801800 nid=0x105761000 runnable

      "Gang worker#3 (Parallel GC Threads)" prio=9 tid=7fc27e802000 nid=0x105864000 runnable

      "Concurrent Mark-Sweep GC Thread" prio=9 tid=7fc27e87d000 nid=0x108f8a000 runnable
      "VM Periodic Task Thread" prio=10 tid=7fc27e9b8000 nid=0x10a540000 waiting on condition

      "Exception Catcher Thread" prio=10 tid=7fc27e001800 nid=0x102384000 runnable
      JNI global references: 1387

      Heap
      par new generation total 19136K, used 1795K [7eae00000, 7ec2c0000, 7ece00000)
      eden space 17024K, 5% used [7eae00000, 7eaed6a28, 7ebea0000)
      from space 2112K, 44% used [7ec0b0000, 7ec19a5c8, 7ec2c0000)
      to space 2112K, 0% used [7ebea0000, 7ebea0000, 7ec0b0000)
      concurrent mark-sweep generation total 63872K, used 2050K [7ece00000, 7f0c60000, 7fae00000)
      concurrent-mark-sweep perm gen total 21248K, used 13649K [7fae00000, 7fc2c0000, 800000000)
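
      One way to read the dump: "main" is in TIMED_WAITING on the Metadata monitor inside Metadata.fetch, while "kafka-network-thread" is parked in a selector poll. That is what a wait/notify handoff looks like when the notifying side has nothing to deliver yet. The sketch below illustrates only that general pattern. MetadataWaitSketch, update, and awaitUpdate are hypothetical names, not Kafka's Metadata API, and the sketch makes no claim about the actual cause of this hang.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a metadata holder; NOT Kafka's actual Metadata class. */
public class MetadataWaitSketch {
    private long version = 0;

    /** Called by a background (I/O) thread when fresh metadata arrives. */
    public synchronized void update() {
        version++;
        notifyAll(); // wake any caller blocked in awaitUpdate
    }

    /**
     * Blocks until the version exceeds lastVersion or maxWaitMs elapses.
     * Returns true if an update was observed, false on timeout.
     */
    public synchronized boolean awaitUpdate(long lastVersion, long maxWaitMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMs);
        while (version <= lastVersion) {
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMs <= 0) {
                return false; // timed out; also avoids wait(0), which would block forever
            }
            // While parked here, a thread dump reports this thread as
            // TIMED_WAITING (on object monitor), as "main" appears above.
            wait(remainingMs);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        MetadataWaitSketch metadata = new MetadataWaitSketch();

        // No updater thread is running, so this call can only time out.
        System.out.println("without updater: " + metadata.awaitUpdate(0, 100));

        // With an updater, the waiter is released as soon as update() runs.
        Thread updater = new Thread(metadata::update);
        updater.start();
        System.out.println("with updater: " + metadata.awaitUpdate(0, 5000));
        updater.join();
    }
}
```

      If the caller retries such a wait in a loop and the updating thread never receives a response, the caller would look hung even though each individual wait is bounded.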

        Activity

        Jay Kreps added a comment -

        Seems to have fixed itself.

        Neha Narkhede added a comment -

        As discussed offline, we are converting all tools over to use the new producer. Regarding this bug, though, I can't seem to reproduce it again. Maybe you can take a look at the thread dump to see if you understand why this is happening.

        Jay Kreps added a comment -

        Hey Jun, why are we converting this?


          People

          • Assignee: Unassigned
          • Reporter: Neha Narkhede
          • Votes: 0
          • Watchers: 2
