Wrapping TuplImpl.values with Collections.unmodifiableList() turns out be very expensive. Its intention is obviously to check and prevent accidental tweaking TuplImpl once created. Given the high cost, if needed, we can limit this extra checking mechanism in debug/dev mode. Being in the critical path it means several thousand/million additional allocations per second of the List wrapper object .... proportional to the number of bolt/spout instances.
|throughput (k/sec)||cores||mem (mb)|
bin/storm jar topos/storm-loadgen-2.0.0-SNAPSHOT.jar org.apache.storm.loadgen.ThroughputVsLatency --rate 550000 --spouts 1 --splitters 3 --counters 2 -c topology.acker.executors=0
|storm-3205||5.4 mill/sec (+27%)|
: bin/storm jar topos/storm-perf-2.0.0-SNAPSHOT.jar org.apache.storm.perf.ConstSpoutIdBoltNullBoltTopo -c topology.acker.executors=0 -c topology.producer.batch.size=1000 400
Note: The perf gains are more evident when operating at high thoughputs w/o backpressure occurring (i.e. some bolts have not yet become a bottleneck)