Apache Storm / STORM-3121

Fix flaky metrics tests in storm-core


    Details

      Description

      The metrics tests in storm-core are flaky, but they fail only rarely. I've only seen them fail on Travis when Travis is under load.

      Example failures:

      classname: org.apache.storm.metrics-test / testname: test-custom-metric-with-multi-tasks
      expected: (clojure.core/= [1 0 0 0 0 0 2] (clojure.core/subvec (org.apache.storm.metrics-test/lookup-bucket-by-comp-id-&-metric-name! "2" "my-custom-metric") 0 N__3207__auto__))
        actual: (not (clojure.core/= [1 0 0 0 0 0 2] [1 0 0 0 0 0 0]))
            at: test_runner.clj:105
      
      classname: org.apache.storm.metrics-test / testname: test-builtin-metrics-2
      expected: (clojure.core/= [1 1] (clojure.core/subvec (org.apache.storm.metrics-test/lookup-bucket-by-comp-id-&-metric-name! "myspout" "__emit-count/default") 0 N__3207__auto__))
        actual: (not (clojure.core/= [1 1] [1 0]))
            at: test_runner.clj:105
      

      The problem is that the tests increment metrics counters in the executor async loops and then expect the counts to land in exact metrics buckets. Bucket creation is triggered by the metrics timer. The timer is included in time simulation and in LocalCluster.waitForIdle, but the executor async loop isn't. As a result, there is no guarantee that the executor async loop gets to run when the test does a sequence like

      Time.advanceClusterTime
      cluster.waitForIdle
      

      because the waitForIdle check doesn't know about the executor async loop.
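
      One common way to remove this kind of race is to have the test wait until the asynchronous increment is actually visible before it triggers the bucket rollover. The sketch below is only an illustration of that idea in plain Java and is not necessarily the fix applied for this issue. It does not use any Storm API: AwaitSketch, awaitCondition, the counter, and the thread standing in for the executor async loop are all hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Storm: polls a condition in real time.
class AwaitSketch {

    /**
     * Polls until the condition holds or the timeout elapses.
     * Returns true if the condition was observed, false on timeout or interrupt.
     */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the condition
            }
            try {
                Thread.sleep(1); // yield so the background thread can make progress
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stands in for a metrics counter updated by the executor async loop.
        AtomicInteger counter = new AtomicInteger();

        // Stands in for the executor async loop, which the idle check cannot see.
        Thread executorLoop = new Thread(counter::incrementAndGet);
        executorLoop.start();

        // Wait for the increment to be visible BEFORE rolling the bucket,
        // instead of assuming the loop has already run.
        if (!awaitCondition(() -> counter.get() == 1, 5_000)) {
            throw new AssertionError("increment was not observed in time");
        }

        // Simulated metrics-timer tick: move the count into a bucket and reset.
        int bucket = counter.getAndSet(0);
        System.out.println("bucket = " + bucket);
    }
}
```

      With a wait like this, the count is already present when the simulated timer fires, so the value lands in the expected bucket regardless of how the threads are scheduled.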

            People

            • Assignee: Stig Rohde Døssing (Srdo)
            • Reporter: Stig Rohde Døssing (Srdo)

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h 10m