CASSANDRA-16714

Fix org.apache.cassandra.utils.SlidingTimeRateTest.testConcurrentUpdateAndGet

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • None
    • Test/unit
    • None
    • Degradation
    • Normal
    • Normal
    • Unit Test
    • All
    • None

      SlidingTimeRateTest fails when the executor runs with a higher thread count and the work happens to be distributed nearly uniformly across the threads.

      I was able to reproduce the issue by decreasing updates to 10000 and adding a sleep statement after rate.update(1):

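                  executor.submit(() -> {
                      // Count tasks per worker thread to measure work distribution
                      threadCnt.computeIfAbsent(Thread.currentThread().getId(), (n) -> new AtomicInteger())
                               .incrementAndGet();
                      // Advance the test time source, then record one update
                      time.sleep(1, TimeUnit.MILLISECONDS);
                      rate.update(1);
                      try
                      {
                          // Real sleep added for the reproduction
                          Thread.sleep(2);
                      }
                      catch (InterruptedException ie) {}
                  });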
      

      In the snippet above I've also added a per-thread increment to measure work distribution. The results are:

      Without sleep (test passes):

      {16=1752, 17=1441, 18=420, 19=1434, 20=468, 21=1259, 22=582, 23=568, 24=670, 13=387, 14=508, 15=511}
      

      With sleep (test fails):

      {16=833, 17=834, 18=835, 19=833, 20=833, 21=833, 22=833, 23=833, 24=834, 13=833, 14=833, 15=833}
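
      To put a number on the two distributions above, the per-thread counts can be reduced to their spread (maximum minus minimum). This is only a minimal sketch: the class and method names are hypothetical, and the arrays simply copy the counts reported above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the Cassandra test code.
final class WorkSpread
{
    // Per-thread task counts copied from the two runs reported above.
    static final int[] WITHOUT_SLEEP = { 1752, 1441, 420, 1434, 468, 1259, 582, 568, 670, 387, 508, 511 };
    static final int[] WITH_SLEEP = { 833, 834, 835, 833, 833, 833, 833, 833, 834, 833, 833, 833 };

    // Difference between the busiest and the least busy thread.
    static int spread(int[] counts)
    {
        return Arrays.stream(counts).max().getAsInt() - Arrays.stream(counts).min().getAsInt();
    }

    // Total number of tasks across all threads.
    static int total(int[] counts)
    {
        return Arrays.stream(counts).sum();
    }
}
```

      Both runs total 10000 updates; the spread is 1365 tasks without the sleep and 2 tasks with it.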
      

      As a result of the uniform work distribution, concurrent updates hit the same timestamp more frequently. This in turn makes the test run finish sooner in TestTimeSource terms, with more hits per timestamp.
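
      The effect can be illustrated with a deliberately simplified model. It assumes (this is an assumption for illustration, not something taken from the TestTimeSource code) that updates overlapping on the same timestamp advance the virtual clock only once between them. The class name, method names, and the `overlapping` parameter are all hypothetical.

```java
// Hypothetical, simplified model of a shared virtual clock; not Cassandra code.
final class VirtualClockModel
{
    // Virtual milliseconds elapsed when every group of `overlapping` updates
    // shares one timestamp and therefore advances the clock by 1 ms only once.
    static long elapsedMillis(int updates, int overlapping)
    {
        return (updates + overlapping - 1) / overlapping; // ceiling division
    }

    // Observed updates per virtual millisecond under the same assumption.
    static double ratePerMilli(int updates, int overlapping)
    {
        return (double) updates / elapsedMillis(updates, overlapping);
    }
}
```

      Under this model, 10000 updates with no overlap span 10000 virtual milliseconds, while 12 fully overlapping threads span only 834, so the measured rate is roughly 12 times higher even though the number of updates is the same.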

      I suggest setting the thread pool size to, say, 4 and increasing the tolerated delta to 150. The former limits the impact of the work distribution effect, while the latter sets more realistic bounds on the possible values (patch).
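
      A rough sketch of what the two suggested settings look like in isolation. It is not the actual patch: the class, the counter standing in for rate.update(1), and the helper names are hypothetical; only the pool size of 4 and the delta of 150 come from the suggestion above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the actual SlidingTimeRateTest patch.
final class BoundedPoolSketch
{
    static final int POOL_SIZE = 4;    // suggested thread pool size
    static final double DELTA = 150.0; // suggested tolerated delta

    // True when the measured value is within `delta` of the expected one.
    static boolean withinDelta(double expected, double actual, double delta)
    {
        return Math.abs(expected - actual) <= delta;
    }

    // Runs `updates` tasks on a fixed pool of POOL_SIZE threads and returns
    // how many completed. The counter is a stand-in for rate.update(1).
    static long runUpdates(int updates)
    {
        ExecutorService executor = Executors.newFixedThreadPool(POOL_SIZE);
        AtomicLong counter = new AtomicLong();
        for (int i = 0; i < updates; i++)
        {
            executor.submit(() -> { counter.incrementAndGet(); });
        }
        executor.shutdown();
        try
        {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES))
                throw new IllegalStateException("updates did not finish in time");
        }
        catch (InterruptedException ie)
        {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting", ie);
        }
        return counter.get();
    }
}
```

      A fixed, small pool caps how many updates can land on one timestamp at once, and the wider delta is then checked with something like withinDelta(expected, measured, BoundedPoolSketch.DELTA), where the expected value depends on the test.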


    Description

      Fix org.apache.cassandra.utils.SlidingTimeRateTest.testConcurrentUpdateAndGet in Cassandra 3.11 

      https://jenkins-cm4.apache.org/job/Cassandra-3.11/174/testReport/junit/org.apache.cassandra.utils/SlidingTimeRateTest/testConcurrentUpdateAndGet_cdc/

      We should also propagate the fix to 4.0, where the utility class and its tests also exist. The class is not currently in use there, so the tests are currently skipped to reduce noise. For reference, see CASSANDRA-16713.

       

            People

              Gerrrr Alex Sorokoumov
              e.dimitrova Ekaterina Dimitrova
              Alex Sorokoumov
              Ekaterina Dimitrova