Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Fix Version/s: 2.0.0, 1.2.0
Environment: Using Storm 1.2.0 preview binaries shared by Stig Rohde Døssing & Jungtaek Lim through the "[Discuss] Release Storm 1.2.0" discussion on the Storm Developers' mailing list.
One Nimbus VM, 6 Supervisor VMs, 3 Zookeeper VMs, and 15 topologies, talking to a set of 5 Kafka broker VMs (based on Kafka 0.10.2), all with Oracle Server JRE 8 update 152.
About 15 topologies, handling around 1 million Kafka messages per minute, and connected to Redis, OpenTSDB & HBase.
Description
Hello,
We have been running Storm 1.2.0 preview on our pre-production supervision system.
We noticed that in the logs of our topology that persists logs to HBase, we got the following exception (about 4 times over a 48-hour period):
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
at java.util.HashMap$KeyIterator.next(HashMap.java:1466)
at org.apache.storm.kafka.spout.KafkaSpout.doSeekRetriableTopicPartitions(KafkaSpout.java:347)
at org.apache.storm.kafka.spout.KafkaSpout.pollKafkaBroker(KafkaSpout.java:320)
at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(KafkaSpout.java:245)
at org.apache.storm.daemon.executor$fn__4963$fn__4978$fn__5009.invoke(executor.clj:647)
at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484)
at clojure.lang.AFn.run(AFn.java:22)
at java.lang.Thread.run(Thread.java:748)
It looks like there's something to fix here, such as making the map thread-safe, or managing exclusive modification of this map at the caller level.
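For illustration only (this is not the actual KafkaSpout code): the stack trace is the classic fail-fast iterator failure, where a HashMap is structurally modified while a loop is iterating over its key set. The hypothetical sketch below reproduces the failure mode and shows the two remedies suggested above, iterating over a snapshot so the caller keeps exclusive control of the live map, or switching to a thread-safe map.

import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RetriableMapSketch {

    // Failure mode from the stack trace: the map is modified while a
    // fail-fast iterator over its key set is live, so the next call to
    // Iterator.next() throws ConcurrentModificationException.
    static void unsafeSeek(Map<String, Long> retriable) {
        for (String tp : retriable.keySet()) {
            retriable.remove(tp); // structural modification mid-iteration
        }
    }

    // Remedy 1: iterate over a snapshot of the keys so the live map can be
    // modified freely during the loop (caller-level exclusivity).
    static void snapshotSeek(Map<String, Long> retriable) {
        for (String tp : new HashSet<>(retriable.keySet())) {
            retriable.remove(tp);
        }
    }

    // Remedy 2: use a ConcurrentHashMap, whose iterators are weakly
    // consistent and never throw ConcurrentModificationException.
    static void concurrentSeek(ConcurrentHashMap<String, Long> retriable) {
        for (String tp : retriable.keySet()) {
            retriable.remove(tp);
        }
    }

    // Hypothetical "topic-partition -> offset" entries for the demo.
    private static Map<String, Long> sampleOffsets() {
        Map<String, Long> m = new HashMap<>();
        for (int p = 0; p < 4; p++) {
            m.put("logs-" + p, 100L + p);
        }
        return m;
    }

    public static void main(String[] args) {
        try {
            unsafeSeek(sampleOffsets());
        } catch (ConcurrentModificationException e) {
            System.out.println("reproduced: " + e);
        }
        snapshotSeek(sampleOffsets());
        concurrentSeek(new ConcurrentHashMap<>(sampleOffsets()));
        System.out.println("snapshot and concurrent variants finished cleanly");
    }
}

Either variant avoids the exception; the snapshot approach keeps the original map type, while ConcurrentHashMap trades fail-fast detection for weakly consistent iteration.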
Note: this topology is using the Storm Kafka Client spout with default properties (unlike our other topologies, which are based on auto-commit). However, it's the one that deals with the highest rate of messages (lines of logs coming from about 10,000 VMs, a nice scale test for Storm).
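For context, the difference between the two modes comes down to the underlying Kafka consumer's enable.auto.commit setting. A minimal sketch using plain kafka-clients properties follows (the broker address and group ids are made up for illustration; this is not the spout's own configuration builder):

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerModeSketch {

    // Hypothetical broker address for illustration only.
    private static final String BOOTSTRAP = "kafka-1:9092";

    private static Properties baseProps(String groupId) {
        Properties p = new Properties();
        p.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP);
        p.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        p.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        p.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        return p;
    }

    // Auto-commit mode: the consumer commits offsets on its own timer,
    // so failed tuples are not tracked for redelivery.
    static Properties autoCommitMode() {
        Properties p = baseProps("autocommit-topology");
        p.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
        return p;
    }

    // Default (spout-managed) mode: auto-commit stays off, offsets are
    // committed only after tuples are acked, and failed offsets become the
    // retriable partitions that doSeekRetriableTopicPartitions seeks back to.
    static Properties spoutManagedMode() {
        Properties p = baseProps("logs-to-hbase-topology");
        p.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        return p;
    }
}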
Could this be fixed in the final Storm 1.2.0 release?
Best regards,
Alexandre Vermeerbergen