Apache Storm / STORM-3422

TupleCaptureBolt is not thread-safe

      Description

      I'm marking this as Major because it is a crash. That said, the problem lies in testing code: it makes integration testing hard, but it does not affect any production code.

       

      First, let me show you a stack trace for Storm 2.0.0:

      java.lang.RuntimeException: java.lang.NullPointerException
      at org.apache.storm.executor.Executor.accept(Executor.java:282) ~[storm-client-2.0.0.jar:2.0.0]
      at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:133) ~[storm-client-2.0.0.jar:2.0.0]
      at org.apache.storm.utils.JCQueue.consume(JCQueue.java:110) ~[storm-client-2.0.0.jar:2.0.0]
      at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:171) ~[storm-client-2.0.0.jar:2.0.0]
      at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:158) ~[storm-client-2.0.0.jar:2.0.0]
      at org.apache.storm.utils.Utils$1.run(Utils.java:388) [storm-client-2.0.0.jar:2.0.0]
      at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
      Caused by: java.lang.NullPointerException
      at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:45) ~[storm-client-2.0.0.jar:2.0.0]
      at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:234) ~[storm-client-2.0.0.jar:2.0.0]
      at org.apache.storm.executor.Executor.accept(Executor.java:275) ~[storm-client-2.0.0.jar:2.0.0]
      ... 6 more

       

      Here's the same stack trace for Storm 1.2.2:

      java.lang.RuntimeException: java.lang.NullPointerException
      at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:522) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:487) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:74) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.daemon.executor$fn__10795$fn__10808$fn__10861.invoke(executor.clj:861) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) [storm-core-1.2.2.jar:1.2.2]
      at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
      at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
      Caused by: java.lang.NullPointerException
      at org.apache.storm.testing.TupleCaptureBolt.execute(TupleCaptureBolt.java:50) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.daemon.executor$fn__10795$tuple_action_fn__10797.invoke(executor.clj:739) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.daemon.executor$mk_task_receiver$fn__10716.invoke(executor.clj:468) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.disruptor$clojure_handler$reify__10135.onEvent(disruptor.clj:41) ~[storm-core-1.2.2.jar:1.2.2]
      at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:509) ~[storm-core-1.2.2.jar:1.2.2]
      ... 6 more

       

      This topology runs as one of our integration tests via Testing.completeTopology(). Both stack traces point to the same code in TupleCaptureBolt.execute(): the bolt's name field is not safely published (it should be marked final), and the internal HashMap is not safe for concurrent access, so it cannot reliably store the data put into it from several threads. Perhaps it should be a ConcurrentHashMap?
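
      To make the suggestion concrete, here is a minimal sketch of the two changes (a final name field and concurrent collections). CaptureSketch, its RESULTS registry, and the demo() helper are hypothetical names for illustration only. This is not the actual TupleCaptureBolt source, and it assumes the bolt keeps a shared map keyed by its name, with one list of captured tuples per source component.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a tuple-capturing bolt; NOT Storm's real class. */
final class CaptureSketch {
    // Shared registry (assumed layout): capture name -> source component -> captured values.
    // ConcurrentHashMap tolerates writes from several executor threads.
    private static final Map<String, Map<String, List<Object>>> RESULTS =
            new ConcurrentHashMap<>();

    // final guarantees the field is visible to other threads once the constructor returns.
    private final String name;

    CaptureSketch(String name) {
        this.name = name;
        RESULTS.put(name, new ConcurrentHashMap<>());
    }

    /** Records one value; safe to call concurrently from multiple threads. */
    void execute(String sourceComponent, Object value) {
        RESULTS.get(name)
               // Atomic "create the list if missing", replacing a racy containsKey/put pair.
               .computeIfAbsent(sourceComponent,
                       k -> Collections.synchronizedList(new ArrayList<>()))
               .add(value);
    }

    Map<String, List<Object>> results() {
        return RESULTS.get(name);
    }

    /** Hypothetical helper: hammers one capture from several threads, returns the total recorded. */
    static int demo(int threads, int perThread) {
        CaptureSketch capture = new CaptureSketch("demo-" + System.nanoTime());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final String source = "spout-" + (t % 2); // two sources shared by all threads
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    capture.execute(source, i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (List<Object> captured : capture.results().values()) {
            total += captured.size();
        }
        return total;
    }
}
```

      With these collections, every value written by the worker threads should end up recorded, so CaptureSketch.demo(4, 1000) is expected to return 4000. With a plain HashMap and ArrayList the same workload can lose entries or throw.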

      Would you accept a PR with a more detailed analysis, or are you going to investigate on your side?

              People

              Assignee: Petr Janeček (JanecekPetr)
              Reporter: Petr Janeček (JanecekPetr)

              Time Tracking

              Original Estimate: Not Specified
              Remaining Estimate: 0h
              Time Spent: 1h 40m