Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: c
    • Labels:
      None

      Description

      In one of my projects that uses Avro, I pass avro_datum_t instances between threads, using the reference count mechanism to make sure that they're not freed while any thread still has a reference to them. I was getting some spurious segfaults, which were caused by the fact that the reference counts aren't updated atomically. I've created a patch that implements atomic reference counts, using the OpenPA library to provide the atomic operations themselves. (That library is MIT licensed, so it can be included in the source tree.)

      Note that only avro_XXX_incref and avro_XXX_decref are thread-safe as a result of this patch. For all of the other library functions, the caller is still responsible for ensuring thread safety.

      The patch makes sure that the OpenPA code works in both the old autotools build and the newer CMake build.

      1. 0002-Serialization-performance-test.patch
        4 kB
        Douglas Creager
      2. 0001-Performance-test-program.patch
        5 kB
        Douglas Creager
      3. 0001-Atomic-reference-counts.patch
        13 kB
        Douglas Creager

        Activity

        Hide
        Douglas Creager added a comment -

        Ignore the 0001 patch; the new openpa directory wasn't being included in the distribution tarball, giving you compilation errors. The new patch fixes this.

        Show
        Douglas Creager added a comment - Ignore the 0001 patch; the new openpa directory wasn't being included in the distribution tarball, giving you compilation errors. The new patch fixes this.
        Hide
        Douglas Creager added a comment -

        Third time's a charm.

        Show
        Douglas Creager added a comment - Third time's a charm.
        Hide
        Bruce Mitchener added a comment -

        I'd like to see more / wider discussion of this.

        1) Portability: Will this work on Windows as set up now via CMake?
        2) New external code...
        3) Performance hit in single-threaded case?
        4) Portability: iOS / Android impact?

        Also, the CMake integration here doesn't build any code, it just uses the headers, I'm guessing that's okay?

        And this piles on top of some other stuff that uses C99 struct initializers ... I have another patch that I've been working on that gets rid of those because they don't work in Visual Studio which will now conflict with this.

        Show
        Bruce Mitchener added a comment - I'd like to see more / wider discussion of this. 1) Portability: Will this work on Windows as set up now via CMake? 2) New external code... 3) Performance hit in single-threaded case? 4) Portability: iOS / Android impact? Also, the CMake integration here doesn't build any code, it just uses the headers, I'm guessing that's okay? And this piles on top of some other stuff that uses C99 struct initializers ... I have another patch that I've been working on that gets rid of those because they don't work in Visual Studio which will now conflict with this.
        Hide
        Douglas Creager added a comment -

        I'm not married to the idea of incorporating OpenPA, it just seemed like an expedient solution. We could easily create our own refcount.h header, that contained static inline definitions like OpenPA's does, to ensure that we can target all of the platforms you mention.

        I'll try to run some tests to compare single-threaded performance. I don't think it's a big hit, but some data to back that up would be nice.

        Show
        Douglas Creager added a comment - I'm not married to the idea of incorporating OpenPA, it just seemed like an expedient solution. We could easily create our own refcount.h header, that contained static inline definitions like OpenPA's does, to ensure that we can target all of the platforms you mention. I'll try to run some tests to compare single-threaded performance. I don't think it's a big hit, but some data to back that up would be nice.
        Hide
        Douglas Creager added a comment -

        Here's a patch that does a performance test of the reference counting mechanism. It allocates a single int value, and then references and dereferences that value 100,000,000 times. It runs the test 3 times and prints out the average time across the 3 runs.

        It turns out that the atomic counts are a bigger performance hit than I thought. With the original implementation, which just uses C's ++ and – operators, the test case ran in 0.460 seconds. Using the new atomic operation, it took 1.820 seconds.

        That said, I don't know if reference counting is the bottleneck in a "typical" use of the library. I'm going to add some more performance tests to the test program to see what we get.

        Show
        Douglas Creager added a comment - Here's a patch that does a performance test of the reference counting mechanism. It allocates a single int value, and then references and dereferences that value 100,000,000 times. It runs the test 3 times and prints out the average time across the 3 runs. It turns out that the atomic counts are a bigger performance hit than I thought. With the original implementation, which just uses C's ++ and – operators, the test case ran in 0.460 seconds. Using the new atomic operation, it took 1.820 seconds. That said, I don't know if reference counting is the bottleneck in a "typical" use of the library. I'm going to add some more performance tests to the test program to see what we get.
        Hide
        Douglas Creager added a comment -

        Here's another performance test. It basically reproduces the write/read test from test_avro_data, but does it 100,000 times and times it.

        In this case, the reference counting isn't as much of a bottleneck. I'm averaging about 0.85 seconds per run with the original non-atomic reference counts, and about 0.87 seconds with atomic reference counts.

        Show
        Douglas Creager added a comment - Here's another performance test. It basically reproduces the write/read test from test_avro_data, but does it 100,000 times and times it. In this case, the reference counting isn't as much of a bottleneck. I'm averaging about 0.85 seconds per run with the original non-atomic reference counts, and about 0.87 seconds with atomic reference counts.
        Hide
        Scott Carey added a comment -

        That sounds about right. Atomic increments are a lot more expensive than a simple increment, since it has to push the value to the processor cache and notify any other processors that may have copies of the cache line. And because an ordinary increment is basically the cheapest thing you can ask a CPU to do. However, such increments are rarely a big chunk of the total work.

        Very heavy multi-threaded access will likely show a bigger performance hit.

        Show
        Scott Carey added a comment - That sounds about right. Atomic increments are a lot more expensive than a simple increment, since it has to push the value to the processor cache and notify any other processors that may have copies of the cache line. And because an ordinary increment is basically the cheapest thing you can ask a CPU to do. However, such increments are rarely a big chunk of the total work. Very heavy multi-threaded access will likely show a bigger performance hit.
        Hide
        Douglas Creager added a comment -

        Here's a new version of the patch that doesn't include the full OpenPA distribution. There's a single header file, src/refcount.h, which defines a bunch of static inline functions.

        I think this patch takes care of all of the platforms that Bruce mentioned. There are implementations using Mac OS X functions (which should also include iOS), GCC intrinsics (valid for any platform with GCC >= 4.1), raw assembly in GCC for i386, x86_64, and ppc, and Windows intrinsics. I don't have a Windows machine to test the Windows intrinsics on; Bruce, would you be able to test that?

        Show
        Douglas Creager added a comment - Here's a new version of the patch that doesn't include the full OpenPA distribution. There's a single header file, src/refcount.h, which defines a bunch of static inline functions. I think this patch takes care of all of the platforms that Bruce mentioned. There are implementations using Mac OS X functions (which should also include iOS), GCC intrinsics (valid for any platform with GCC >= 4.1), raw assembly in GCC for i386, x86_64, and ppc, and Windows intrinsics. I don't have a Windows machine to test the Windows intrinsics on; Bruce, would you be able to test that?
        Hide
        Douglas Creager added a comment -

        There hasn't been any word on whether the Windows intrinsics work, but this is working on all of the other platforms I can test. And we still don't officially list Windows as a supported port; AVRO-551 is still open to get the tests to compile and run under Windows. Unless I hear objections in the next couple of days, I'm going to go ahead and commit this on the 1.6 trunk branch.

        Show
        Douglas Creager added a comment - There hasn't been any word on whether the Windows intrinsics work, but this is working on all of the other platforms I can test. And we still don't officially list Windows as a supported port; AVRO-551 is still open to get the tests to compile and run under Windows. Unless I hear objections in the next couple of days, I'm going to go ahead and commit this on the 1.6 trunk branch.
        Hide
        Douglas Creager added a comment -

        Committed to SVN for 1.6.0; not back-ported to 1.5 branch

        Show
        Douglas Creager added a comment - Committed to SVN for 1.6.0; not back-ported to 1.5 branch
        Hide
        Doug Cutting added a comment -

        Can you please update the top-level CHANGES.txt too? Thanks!

        Show
        Doug Cutting added a comment - Can you please update the top-level CHANGES.txt too? Thanks!
        Hide
        Douglas Creager added a comment -

        Apologies! CHANGES.txt now updated.

        Show
        Douglas Creager added a comment - Apologies! CHANGES.txt now updated.

          People

          • Assignee:
            Douglas Creager
            Reporter:
            Douglas Creager
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development