[STORM-1742] More accurate 'complete latency' - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.0.0, 1.1.0, 1.0.3
Component/s: storm-core
Labels: None
Epic Link: Release Apache Storm 1.1.0

Description

I have already started a discussion thread on the dev@ list. Below is a copy of my mail.
http://mail-archives.apache.org/mod_mbox/storm-dev/201604.mbox/%3CCAF5108gn=rSkUNdfs7-sgY_pD-_prgJ2hF2T5e5ZPpP-KnD-hg@mail.gmail.com%3E

While thinking about metrics improvements, I wondered how many users know
what 'complete latency' exactly is. In fact, it is somewhat complicated:
additional waiting time can be added to complete latency because the
spout's event loop is single-threaded.

A long-running nextTuple() / ack() / fail() can affect complete latency, but
this happens behind the scenes. No latency information is provided for these
calls, and some users are not even aware of this characteristic. Moreover,
calls to nextTuple() can be skipped due to max spout waiting, which would
make the effect hard to estimate even if the average latency of nextTuple()
were provided.
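
To illustrate the effect described above, here is a minimal sketch using
logical timestamps instead of a real topology. The class and method names
are hypothetical and not part of Storm; the only assumption is that the
spout can handle an ack message no earlier than the moment its
single-threaded loop becomes free.

```java
// Hypothetical model, not Storm code: shows how a busy spout loop
// inflates the latency the spout observes for a root tuple.
final class SpoutLoopDelaySketch {

    /** Latency from emit until the ack message arrives at the spout. */
    static long arrivalLatencyMs(long emitMs, long ackArrivesMs) {
        return ackArrivesMs - emitMs;
    }

    /**
     * Latency the spout would record. The ack is handled only when the
     * loop is free, so time spent in nextTuple() / ack() / fail() for
     * other tuples is silently added.
     */
    static long observedLatencyMs(long emitMs, long ackArrivesMs, long loopBusyUntilMs) {
        long handledAtMs = Math.max(ackArrivesMs, loopBusyUntilMs);
        return handledAtMs - emitMs;
    }

    public static void main(String[] args) {
        long emitMs = 0;
        long ackArrivesMs = 20;     // ack message reaches the spout's queue
        long loopBusyUntilMs = 50;  // a slow nextTuple() is still running

        System.out.println("arrival latency:  "
                + arrivalLatencyMs(emitMs, ackArrivesMs) + " ms");
        System.out.println("observed latency: "
                + observedLatencyMs(emitMs, ackArrivesMs, loopBusyUntilMs) + " ms");
    }
}
```

With these illustrative numbers the ack arrives after 20 ms, but the spout
records 50 ms because the loop was occupied for the 30 ms in between.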

I think separating threads (moving the tuple handler to a separate thread, as
JStorm does) would close the gap, but it requires spout logic to be
thread-safe, so I'd like to find a workaround first.

My rough idea is to let the Acker decide the end time for the root tuple.
There are two ways to decide the start time for the root tuple:

1. When the Spout is about to emit ACK_INIT to the Acker (in other words,
keep it as it is)

The Acker sends the ack / fail message to the Spout with a timestamp, and
the Spout calculates the time delta.
pros: It is the most accurate way, since it respects the definition of
'complete latency'.
cons: Clock synchronization between machines becomes very important.
Sub-millisecond precision would be required.

2. When the Acker receives ACK_INIT from the Spout

The Acker calculates the time delta itself, and sends the ack / fail
message to the Spout with the time delta.
pros: No requirement to synchronize clocks between servers so strictly.
cons: It does not include the latency of sending / receiving ACK_INIT
between the Spout and the Acker.
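
The trade-off between the two options can be sketched with simple
arithmetic. This is only an illustration under stated assumptions: each
machine's clock is modeled as "true time plus a fixed offset", and all
names below are hypothetical rather than taken from the Storm code base.

```java
// Hypothetical model, not Storm code: compares the two ways of choosing
// the start time when the Acker decides the end time.
final class CompleteLatencyOptionsSketch {

    /**
     * Option 1: start is stamped by the Spout's clock, end by the Acker's
     * clock. Any difference between the two clock offsets leaks into the
     * result.
     */
    static long option1Ms(long spoutEmitTrueMs, long ackerDoneTrueMs,
                          long spoutClockOffsetMs, long ackerClockOffsetMs) {
        long startTs = spoutEmitTrueMs + spoutClockOffsetMs; // Spout's clock
        long endTs = ackerDoneTrueMs + ackerClockOffsetMs;   // Acker's clock
        return endTs - startTs;
    }

    /**
     * Option 2: both timestamps are taken on the Acker, so its clock
     * offset cancels out, but the ACK_INIT transit time is not counted.
     */
    static long option2Ms(long ackInitReceivedTrueMs, long ackerDoneTrueMs,
                          long ackerClockOffsetMs) {
        long startTs = ackInitReceivedTrueMs + ackerClockOffsetMs;
        long endTs = ackerDoneTrueMs + ackerClockOffsetMs;
        return endTs - startTs;
    }

    public static void main(String[] args) {
        long emit = 1000;        // Spout emits ACK_INIT (true time)
        long initReceived = 1002; // Acker receives ACK_INIT (true time)
        long done = 1040;        // Acker sees the tuple tree complete
        long spoutOffset = 0;    // Spout clock is exact
        long ackerOffset = 5;    // Acker clock runs 5 ms ahead

        System.out.println("true:     " + (done - emit) + " ms");
        System.out.println("option 1: "
                + option1Ms(emit, done, spoutOffset, ackerOffset) + " ms");
        System.out.println("option 2: "
                + option2Ms(initReceived, done, ackerOffset) + " ms");
    }
}
```

With these illustrative numbers the true value is 40 ms; option 1 reports
45 ms (off by the 5 ms clock skew) and option 2 reports 38 ms (missing the
2 ms ACK_INIT transit).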

Issue Links

links to

GitHub Pull Request #1379

GitHub Pull Request #1523

People

Assignee: Jungtaek Lim

Reporter: Jungtaek Lim

Votes: 0

Watchers: 3

Dates

Created: 29/Apr/16 02:27

Updated: 20/Jan/17 02:01

Resolved: 08/Jul/16 11:08