Apache Storm / STORM-1837

Running local clusters without simulating time breaks Testing.completeTopology, and may cause message loss

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 2.0.0, 1.0.1
    • Fix Version/s: 2.0.0, 1.1.0
    • Component/s: None
    • Labels: None

      Description

      Since https://github.com/apache/storm/pull/810, Testing.completeTopology can no longer be called when time simulation is disabled. The function calls advance-cluster-time, which in turn calls Time/advanceTime. advance-cluster-time should only be called when time is being simulated.
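      The guard proposed above can be sketched in isolation as follows. This is a minimal illustration, not the actual Storm patch: SketchClock and ClusterTimeSketch are hypothetical stand-ins, and the clock is assumed to reject advanceTime when simulation is off, which is the failure this issue describes.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a clock that supports simulated time. */
final class SketchClock {
    private final AtomicBoolean simulating = new AtomicBoolean(false);
    private final AtomicLong simulatedMs = new AtomicLong(0);

    void startSimulating() {
        simulating.set(true);
    }

    boolean isSimulating() {
        return simulating.get();
    }

    /** Assumption: advancing is only legal while time is simulated. */
    void advanceTime(long ms) {
        if (!simulating.get()) {
            throw new IllegalStateException("Cannot advance time when not simulating");
        }
        simulatedMs.addAndGet(ms);
    }

    long simulatedMillis() {
        return simulatedMs.get();
    }
}

/** Hypothetical helper showing the guard around the advance call. */
final class ClusterTimeSketch {
    /** Advances the clock only when simulating; returns whether it advanced. */
    static boolean advanceIfSimulating(SketchClock clock, long ms) {
        if (!clock.isSimulating()) {
            return false; // real time passes on its own, nothing to advance
        }
        clock.advanceTime(ms);
        return true;
    }
}
```

      With a guard like this, a caller that polls for topology completion can run the same loop in both modes and simply wait on real time when simulation is off.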

      Since https://github.com/apache/storm/pull/830, a local cluster run without time simulation may lose messages. When a worker emits a message to a worker that hasn't started yet, the message is lost. This can happen because, with time simulation disabled, spouts may start emitting before all workers have started. Local clusters usually run without message timeouts, so lost messages are not replayed, and tests relying on Testing.withLocalCluster become flaky.

      The cause is that there are no longer any queues to hold messages for workers that haven't started yet. See https://github.com/apache/storm/pull/830/files#diff-c6ff4208ef84c7a5a1a6b8b6bd1f7d19R104. Messages destined for a worker that hasn't registered a receive callback yet should be queued until it does.
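      One way to picture the proposed queue is the sketch below. It is an assumption-laden illustration rather than Storm's local transport: BufferingLocalTransport, its integer port keys, and its method names are all hypothetical. Messages sent before a receiver registers are held per port and flushed in order on registration.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical in-process transport that buffers messages for late receivers. */
final class BufferingLocalTransport<M> {
    private final Map<Integer, Consumer<M>> callbacks = new HashMap<>();
    private final Map<Integer, Queue<M>> pending = new HashMap<>();

    /** Delivers immediately if a receiver is registered, otherwise queues. */
    synchronized void send(int port, M message) {
        Consumer<M> callback = callbacks.get(port);
        if (callback != null) {
            callback.accept(message);
        } else {
            pending.computeIfAbsent(port, p -> new ArrayDeque<>()).add(message);
        }
    }

    /** Registers the receiver and flushes its backlog in arrival order. */
    synchronized void registerReceiver(int port, Consumer<M> callback) {
        callbacks.put(port, callback);
        Queue<M> backlog = pending.remove(port);
        if (backlog != null) {
            for (M message : backlog) {
                callback.accept(message);
            }
        }
    }

    /** Number of messages still waiting for a receiver on this port. */
    synchronized int pendingCount(int port) {
        Queue<M> backlog = pending.get(port);
        return backlog == null ? 0 : backlog.size();
    }
}
```

      The sketch invokes callbacks while holding the lock, which keeps ordering simple but would not suit slow receivers; an unbounded backlog is likewise acceptable only because local clusters are short-lived test fixtures.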

            People

            • Assignee: Srdo Stig Rohde Døssing
            • Reporter: Srdo Stig Rohde Døssing
            • Votes: 0
            • Watchers: 3

                Time Tracking

                • Original Estimate: Not Specified
                • Remaining Estimate: 0h
                • Time Spent: 2h