Details

    • Epic Color:
      ghx-label-7

      Description

      The BenchmarkTest.Basic test simply compares how many times a function that memcopies 16 bytes of data can be executed within a give time versus a function that memcopies 128 bytes of data in the same amount of given time. Recently it had a string of failures, meaning that the function copying more data took less time. The StopWatch class is used to do the timing. I'm assuming this test is basically just a unit test for the StopWatch class. According to a comment in the StopWatch class, it is inaccurate if context switching occurs. So I think the test needs to call getrusage() before, in-between and after taking the two measurements and take the number of context switches during each measurement into account to determine if it is fair to compare the two measurements before deciding whether to fail. The test code is in be/src/util/benchmark-test.cc. The test failure looks like this.

      11:07:28       Start 55: benchmark-test
      11:07:29 55/88 Test #55: benchmark-test ...................***Failed    0.54 sec
      11:07:29 Turning perftools heap leak checking off
      11:07:29 [==========] Running 1 test from 1 test case.
      11:07:29 [----------] Global test environment set-up.
      11:07:29 [----------] 1 test from BenchmarkTest
      11:07:29 [ RUN      ] BenchmarkTest.Basic
      11:07:29 Rate 16 Byte: 48295
      11:07:29 Rate 128 Byte: 72249.1
      11:07:29 /home/jenkins/Impala/be/src/util/benchmark-test.cc:67: Failure
      11:07:29 Expected: (rate_copy_128) < (rate_copy_16), actual: 72249.1 vs 48295
      11:07:29 [  FAILED  ] BenchmarkTest.Basic (128 ms)
      11:07:29 [----------] 1 test from BenchmarkTest (128 ms total)
      11:07:29 
      11:07:29 [----------] Global test environment tear-down
      11:07:29 [==========] 1 test from 1 test case ran. (129 ms total)
      11:07:29 [  PASSED  ] 0 tests.
      11:07:29 [  FAILED  ] 1 test, listed below:
      11:07:29 [  FAILED  ] BenchmarkTest.Basic
      11:07:29 
      11:07:29  1 FAILED TEST

        Activity

        Hide
        dhecht Dan Hecht added a comment -

        Matthew Mulder, was this with an ASAN enabled build?

        Show
        dhecht Dan Hecht added a comment - Matthew Mulder , was this with an ASAN enabled build?
        Hide
        mmulder Matthew Mulder added a comment -

        No.

        Show
        mmulder Matthew Mulder added a comment - No.
        Hide
        mjacobs Matthew Jacobs added a comment -

        raising the priority because this is showing up in recent builds and we need to purge flaky tests

        Show
        mjacobs Matthew Jacobs added a comment - raising the priority because this is showing up in recent builds and we need to purge flaky tests
        Hide
        mjacobs Matthew Jacobs added a comment -

        Zach Amsden can you look at this?

        Show
        mjacobs Matthew Jacobs added a comment - Zach Amsden can you look at this?
        Hide
        mjacobs Matthew Jacobs added a comment - - edited

        Maybe we should remove this test? I'm not sure if it's worth adding in intelligence to consider context switches? I don't know what coverage this provides that we think we really need.
        Dan Hecht what do you think?

        Show
        mjacobs Matthew Jacobs added a comment - - edited Maybe we should remove this test? I'm not sure if it's worth adding in intelligence to consider context switches? I don't know what coverage this provides that we think we really need. Dan Hecht what do you think?
        Hide
        zamsden Zach Amsden added a comment -

        This should be fixable with getrusage(). We can discard results when involuntary context switches goes too high as well to eliminate flake runs when there is CPU pressure on the machine.

        Show
        zamsden Zach Amsden added a comment - This should be fixable with getrusage(). We can discard results when involuntary context switches goes too high as well to eliminate flake runs when there is CPU pressure on the machine.
        Hide
        dhecht Dan Hecht added a comment -

        I don't have a strong feeling either way. If it's easy and reliable to use getrusage() that's fine. Or deleting the test is also fine.

        But I wonder why this is now failing. We run backend tests single threaded with nothing else, right? Maybe it's just another example of jenkins deploying multiple jobs to the same machine, or something?

        Show
        dhecht Dan Hecht added a comment - I don't have a strong feeling either way. If it's easy and reliable to use getrusage() that's fine. Or deleting the test is also fine. But I wonder why this is now failing. We run backend tests single threaded with nothing else, right? Maybe it's just another example of jenkins deploying multiple jobs to the same machine, or something?
        Hide
        mjacobs Matthew Jacobs added a comment -

        I didn't see any other jobs running at the same time. We saw it happen twice over the last 2 weeks so it seems unlikely to be that, especially since this is a fast test. I don't have a better theory though

        Show
        mjacobs Matthew Jacobs added a comment - I didn't see any other jobs running at the same time. We saw it happen twice over the last 2 weeks so it seems unlikely to be that, especially since this is a fast test. I don't have a better theory though
        Hide
        zamsden Zach Amsden added a comment -
        Show
        zamsden Zach Amsden added a comment - Fixed in https://gerrit.cloudera.org/#/c/6935/

          People

          • Assignee:
            zamsden Zach Amsden
            Reporter:
            mmulder Matthew Mulder
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development