Uploaded image for project: 'Thrift'
  1. Thrift
  2. THRIFT-1928

Race may still exist in TFileTransport::flush()

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.9
    • 0.9.1
    • None
    • Ubuntu 10.04 server running on Virtualbox hosted on Ubuntu 12.04. Core i7 dual-core. (Dell Latitude E6530)

    Description

      I'm building Thrift 0.9.0 on Ubuntu 10.04 to upgrade our production system. Since my dev box is 12.04, I'm using Virtualbox. When running 'make check' I ran into a failure on TFileTransportTest taking too long. Investigation found this patch:
      https://issues.apache.org/jira/secure/attachment/12455704/ASF.LICENSE.NOT.GRANTED--v1-0034-THRIFT-926.-cpp-Fix-bugs-in-TFileTransport-flush.patch

      I think I might still be seeing this bug pop up.
      I thought at first it might be just that my machine was slow, it's a core i7 running Virtualbox so I can build for an older release we use in production.
      So I augmented TFileTransportTest.cpp to give it plenty of time and show how long each flush took:

      for (unsigned int n = 0; n < 10; ++n)

      { struct timeval preflush; gettimeofday(&preflush, NULL); transport.flush(); struct timeval now; gettimeofday(&now, NULL); int delta = time_diff(&preflush, &now); fprintf(stderr, "Flush took:%d microsecs\n",delta); // Fail if at any point we've been running for longer than half a second. // (With the buggy code, TFileTransport used to take 3 seconds per flush()) // // Use a fatal fail so we break out early, rather than continuing to make // many more slow flush() calls. delta = time_diff(&start, &now); BOOST_REQUIRE_LT(delta, 100000000); }

      Here's the output from the test run with '"make check":
      ...
      PASS: ZlibTest
      Flush took:75 microsecs
      Flush took:3000195 microsecs
      Flush took:3000405 microsecs
      Flush took:27 microsecs
      Flush took:16 microsecs
      Flush took:16 microsecs
      Flush took:16 microsecs
      Flush took:16 microsecs
      Flush took:13 microsecs
      Flush took:14 microsecs
      <TestLog><TestSuite name="TFileTransportTest"><TestCase name="test_destructor"><TestingTime>210000</TestingTime></TestCase><TestCase name="test_flush_max_us1"><TestingTime>20000</TestingTime></TestCase><TestCase name="test_flush_max_us2"><TestingTime>0</TestingTime></TestCase><TestCase name="test_flush_max_us3"><TestingTime>0</TestingTime></TestCase><TestCase name="test_noop_flush"><TestingTime>10000</TestingTime></TestCase></TestSuite></TestLog>

          • No errors detected
            PASS: TFileTransportTest

      Running the test more directly:
      ubuntu@U1004AMD64:~/thrift-0.9.0/lib/cpp/test$ ./TFileTransportTest
      Running 5 test cases...
      Flush took:70 microsecs
      Flush took:3000333 microsecs
      Flush took:3000376 microsecs
      Flush took:3000256 microsecs
      Flush took:3000348 microsecs
      Flush took:3000350 microsecs
      Flush took:3000278 microsecs
      Flush took:23 microsecs
      Flush took:16 microsecs
      Flush took:16 microsecs

      As you can see, multiple runs show similar output, where some flushes are very fast, and some flushes take just over 3 seconds. That makes me think there's still a race lurking somewhere.

      Attachments

        Issue Links

          Activity

            People

              jfarrell Jake Farrell
              rthille Robert P. Thille
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: