  Apache Storm / STORM-2231

NULL in DisruptorQueue while multi-threaded ack

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.1, 1.1.0
    • Fix Version/s: 2.0.0, 1.2.0, 1.1.2, 1.0.5
    • Component/s: storm-core
    • Labels: None

      Description

      I use a simple topology with one spout (9 workers) and one bolt (9 workers).
      I have topology.backpressure.enable: false in storm.yaml.
      The spouts emit about 10,000,000 tuples in 10 minutes. Max spout pending is 80,000.
      The bolts buffer their tuples for 60 seconds, then flush them to a database and ack them in parallel (10 threads).
      I read that OutputCollector can safely be used from multiple threads, so that is what I do.
      I don't have any bottleneck in the bolts (flushing to the database) or the spouts (Kafka spout), but about 2% of tuples fail due to the tuple-processing timeout (the failures are recorded in the spout stats only).
      I am sure the bolts ack all tuples, but some of the acks never reach the spouts.

      While acking from multiple threads, I see many errors like the following in the worker logs:
      2016-12-01 13:21:10.741 o.a.s.u.DisruptorQueue [ERROR] NULL found in disruptor-executor[3 3]-send-queue:853877

      I tried wrapping the OutputCollector in a synchronized wrapper to fix the error, but it didn't help.

      I found a workaround that helps: I still do all of the bolt's processing in multiple threads, but I call OutputCollector.ack from one single separate thread.
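
      A minimal sketch of this pattern (the class and names are illustrative, not my actual topology code; it uses the same imports as the test case further below):

          // Illustrative sketch: processing stays multi-threaded, but only one
          // dedicated thread ever calls OutputCollector.ack.
          public static class SingleAckThreadBolt extends BaseRichBolt {

              private transient OutputCollector collector;
              private transient BlockingQueue<Tuple> ackQueue;
              private transient ExecutorService workers;

              @Override
              public void prepare(Map stormConf, TopologyContext context, OutputCollector outputCollector) {
                  collector = outputCollector;
                  ackQueue = new LinkedBlockingQueue<>();
                  workers = Executors.newFixedThreadPool(10); // processing threads

                  Thread ackThread = new Thread(new Runnable() {
                      @Override
                      public void run() {
                          try {
                              while (true) {
                                  collector.ack(ackQueue.take()); // the only thread that acks
                              }
                          } catch (InterruptedException ex) {
                              Thread.currentThread().interrupt();
                          }
                      }
                  });
                  ackThread.setDaemon(true);
                  ackThread.start();
              }

              @Override
              public void execute(final Tuple input) {
                  workers.submit(new Runnable() {
                      @Override
                      public void run() {
                          // ... buffer and flush to the database here ...
                          ackQueue.add(input); // hand off instead of acking from this thread
                      }
                  });
              }

              @Override
              public void declareOutputFields(OutputFieldsDeclarer declarer) {
              }
          }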

      I think Storm has a bug in the multi-threaded use of OutputCollector.

      If my topology has much less load, e.g. 500,000 tuples per 10 minutes, then I don't lose any acks.


          Activity

          Kevin Conaway added a comment:

          I'm also seeing this in 1.1.0

          Jungtaek Lim added a comment:

          Alexander Kharitonov Kevin Conaway
          Could you give us some more information about this? Could you also share why you're launching threads in the bolt and acking from there, instead of increasing the task count or adding more bolts? It seems like a plausible use case to me, but I hadn't actually seen it in practice.

          Jungtaek Lim added a comment:

          Alexander Kharitonov Kevin Conaway
          I think I've found a suspicious spot, but the possible fixes may affect performance, so I'd like to spend some time running performance tests on them.
          I haven't been able to reproduce the issue and don't know how easy it is to reproduce, so it would be really helpful if one of you could test and verify the patch.

          Kevin Conaway added a comment:

          In our case the bolts send data to an external service asynchronously. The tuples are acked in a callback listener on the future that is returned from this call.

          We see this occur when we get a large spike in traffic.

          We are happy to take a look at the patch

          Jungtaek Lim added a comment:

          Kevin Conaway
          Please take a look at https://github.com/apache/storm/pull/2293
          Also, please let me know if you'd like to help reproduce this but would prefer a patched jar for your specific version rather than building it yourself. I'll build and provide it.

          Kevin Conaway added a comment:

          Will this patch apply to 1.1.x?

          Is the workaround here to synchronize on the OutputCollector before using it? Did I read your PR comment correctly?

          Jungtaek Lim added a comment:

          Kevin Conaway
          I just merged this into all 1.x version lines, so the patch will ship in 1.1.2. I would appreciate it if you could test the patch before the release.

          By the way, I suspect the only case where wrapping the output collector with the synchronized keyword doesn't help is N tasks : 1 executor. If that is not your setup and wrapping the output collector still doesn't work for you, please let me know the details of your topology.
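
          (For reference, a minimal sketch of an N tasks : 1 executor declaration; the component names are illustrative:)

              // One executor (parallelism hint 1) multiplexing four tasks of this bolt.
              builder.setBolt("bolt", new ExampleBolt(), 1)
                     .setNumTasks(4)
                     .shuffleGrouping("spout");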

          Jungtaek Lim added a comment:

          Merged to master, 1.x, 1.1.x, 1.0.x branches.

          Kevin Conaway added a comment (edited):

          Jungtaek Lim Per your suggestion, I am now synchronizing on the OutputCollector before acking or failing a tuple. However, I still see the issue when a spike of traffic arrives.
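
          (That is, every ack/fail now happens inside a block like this sketch, not the actual production code:)

              synchronized (outputCollector) {  // outputCollector is the bolt's OutputCollector field
                  outputCollector.ack(input);   // likewise for outputCollector.fail(input)
              }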

          Kevin Conaway added a comment:

          Jungtaek Lim btw I can confirm that your patch fixes the issue with this test case that I wrote:

          Given that the underlying issue is a race condition, you may need to run it a few times, but the error always shows up for me.

          Per my comment above, attempting to synchronize on the OutputCollector also doesn't work when the disruptor-queue producer type is single-threaded. I'm not sure what the workaround would be without your patch in place.

          package org.apache.storm;
          
          import org.apache.storm.task.OutputCollector;
          import org.apache.storm.task.TopologyContext;
          import org.apache.storm.testing.AckFailMapTracker;
          import org.apache.storm.testing.FeederSpout;
          import org.apache.storm.testing.MkClusterParam;
          import org.apache.storm.testing.TestJob;
          import org.apache.storm.testing.TrackedTopology;
          import org.apache.storm.topology.OutputFieldsDeclarer;
          import org.apache.storm.topology.TopologyBuilder;
          import org.apache.storm.topology.base.BaseRichBolt;
          import org.apache.storm.tuple.Fields;
          import org.apache.storm.tuple.Tuple;
          import org.apache.storm.tuple.Values;
          import org.junit.Test;
          
          import java.util.ArrayList;
          import java.util.List;
          import java.util.Map;
          import java.util.UUID;
          import java.util.concurrent.BlockingQueue;
          import java.util.concurrent.ExecutorService;
          import java.util.concurrent.Executors;
          import java.util.concurrent.LinkedBlockingQueue;
          import java.util.concurrent.TimeUnit;
          
          public class Storm2231 {
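              // Reproduces STORM-2231: feeds a high volume of tuples through a local
              // cluster while the bolt acks them from background threads; the
              // "NULL found in disruptor-executor..." errors appear in the worker logs.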
          
              @Test
              public void runTest() {
                  Config daemonConfig = new Config();
                  daemonConfig.put(Config.STORM_LOCAL_HOSTNAME, "localhost");
          
                  MkClusterParam clusterParams = new MkClusterParam();
                  clusterParams.setDaemonConf(daemonConfig);
          
                  Testing.withTrackedCluster(clusterParams, new TestJob() {
                      @Override
                      public void run(ILocalCluster cluster) throws Exception {
                          AckFailMapTracker tracker = new AckFailMapTracker();
          
                          final FeederSpout spout = new FeederSpout(new Fields("message"));
                          spout.setAckFailDelegate(tracker);
          
                          TopologyBuilder builder = new TopologyBuilder();
                          builder.setSpout("spout", spout, 50);
                          builder.setBolt("bolt", new ExampleBolt(), 10).shuffleGrouping("spout");
          
                          Config topologyConfig = new Config();
                          topologyConfig.setDebug(false);
          
                          TrackedTopology tracked = Testing.mkTrackedTopology(cluster, builder.createTopology());
                          cluster.submitTopology(String.valueOf(System.nanoTime()), topologyConfig, tracked.getTopology());
          
                          ExecutorService service = Executors.newCachedThreadPool();
          
                          for (int i=0; i < 20; i++) {
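                              // Each of the 20 feeder threads emits just over 500k
                              // uniquely-identified tuples, simulating a traffic spike.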
                              service.submit(new Runnable() {
                                  @Override
                                  public void run() {
                                      for (int j=0; j <= 500_000; j++) {
                                          String messageId = Thread.currentThread().getId() + "-" + UUID.randomUUID().toString();
                                          synchronized (spout) {
                                              spout.feed(new Values(messageId), messageId);
                                          }
                                      }
                                  }
                              });
                          }
          
                          service.shutdown();
                          service.awaitTermination(1, TimeUnit.DAYS);
                      }
                  });
              }
          
              public static class ExampleBolt extends BaseRichBolt {
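                  // Defers every ack to a pool of background threads, reproducing the
                  // multi-threaded ack pattern from the bug report.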
          
                  private transient ExecutorService executorService;
                  private transient BlockingQueue<Runnable> callbackQueue;
                  private transient OutputCollector outputCollector;
          
                  @Override
                  public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
                      outputCollector = collector;
                      callbackQueue = new LinkedBlockingQueue<>();
                      executorService = Executors.newCachedThreadPool();
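                      // Five consumer threads drain the callback queue and perform the acks.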
                      for (int i = 0; i < 5; i++) {
                          executorService.submit(new Runnable() {
                              @Override
                              public void run() {
                                  while (!Thread.currentThread().isInterrupted()) {
                                      List<Runnable> callbacks = new ArrayList<>();
                                      callbackQueue.drainTo(callbacks);
                                      for (Runnable callback : callbacks) {
                                          callback.run();
                                      }
                                      try {
                                          Thread.sleep(100);
                                      } catch (InterruptedException ex) {
                                          Thread.currentThread().interrupt();
                                      }
                                  }
                              }
                          });
                      }
                  }
          
                  @Override
                  public void execute(final Tuple input) {
                      try {
                          callbackQueue.put(new Runnable() {
                              @Override
                              public void run() {
                                  outputCollector.ack(input);
                              }
                          });
                      } catch (InterruptedException ex) {
                          Thread.currentThread().interrupt();
                          outputCollector.fail(input);
                      }
                  }
          
                  @Override
                  public void declareOutputFields(OutputFieldsDeclarer declarer) {
          
                  }
          
                  @Override
                  public void cleanup() {
                      executorService.shutdownNow();
                  }
              }
          }
          

            People

            • Assignee: Jungtaek Lim
            • Reporter: Alexander Kharitonov
            • Votes: 1
            • Watchers: 4


                Time Tracking

                • Original Estimate: Not Specified
                • Remaining Estimate: 0h
                • Time Spent: 1h 20m
