  1. Qpid
  2. QPID-2341

Annotate replicated broker classes with assertions.


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.8
    • Component/s: C++ Broker
    • Labels: None

    Description

      A clustered broker maintains consistency of replicated objects by only modifying them in a "replication safe" thread context: while receiving an update or dispatching cluster events.

      A repeated source of cluster bugs is broker code that unwittingly modifies replicated objects in an unsafe context such as a timer thread. These bugs are intermittent race conditions that are hard to track down.

      Proposal: annotate broker code with assertions to identify code that modifies replicated state and log/abort if such code is called in an unsafe context:

      // New class:
      namespace broker {
      class Replicated {
      protected:
          void assertReplicationSafe();
      };

      // Existing classes
      class Queue : public Replicated { // Mark Queue as state that may be replicated.
          void someQueueModifier() {
              assertReplicationSafe(); // This function should only be called in a replication-safe context.
          }
      };
      } // namespace broker

      The assertion is cheap: just testing a thread-local boolean value. In a non-clustered broker it does nothing.
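
      A minimal sketch of how such a check could work, assuming a thread-local flag that the cluster code sets while it is receiving an update or dispatching cluster events. The names replicationSafe, clustered and ReplicationSafeScope are hypothetical and not taken from the attached patch, and the sketch throws an exception where the proposal says log/abort so that the behaviour is easy to exercise:

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {

// Hypothetical flag: true only while the current thread is in a
// replication-safe context. Each thread has its own copy, so a timer
// thread sees false unless it explicitly enters a safe scope.
thread_local bool replicationSafe = false;

// Hypothetical switch: stays false in a non-clustered broker, which
// turns the check into a no-op.
bool clustered = false;

// RAII guard the cluster code could hold while receiving an update or
// dispatching a cluster event. Restores the previous value on exit.
class ReplicationSafeScope {
public:
    ReplicationSafeScope() : previous(replicationSafe) { replicationSafe = true; }
    ~ReplicationSafeScope() {
        assert(replicationSafe); // nothing should clear the flag inside the scope
        replicationSafe = previous;
    }
    ReplicationSafeScope(const ReplicationSafeScope&) = delete;
    ReplicationSafeScope& operator=(const ReplicationSafeScope&) = delete;
private:
    bool previous;
};

class Replicated {
protected:
    // The check itself: two boolean tests, one of them thread-local.
    void assertReplicationSafe() {
        if (clustered && !replicationSafe)
            throw std::logic_error("replicated state modified in unsafe context");
    }
};

class Queue : public Replicated {
public:
    // A modifier of replicated state: check first, then modify.
    void push() { assertReplicationSafe(); ++depth_; }
    int depth() const { return depth_; }
private:
    int depth_ = 0;
};

} // namespace sketch
```

      In this sketch, a call to push() made while a ReplicationSafeScope is alive on the calling thread succeeds, while the same call from any other thread of a clustered broker fails the check before the queue is touched.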

      This technique has already proven valuable in debugging a recent bug; leaving the assertions permanently in the code should speed up debugging of future bugs.

      This would be the beginning of a formal contract between the broker code and the cluster that should make things more maintainable in the long run.

      Attachments

        1. cluster_safe.patch
          17 kB
          Alan Conway

        Activity

          People

            Assignee: Alan Conway
            Reporter: Alan Conway
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved: