PROTON-2483 (Qpid Proton)

TSAN reported potential deadlock in epoll proactor when run via Qpid Dispatch router.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: proton-c-0.36.0
    • Fix Version: proton-c-0.37.0
    • Component: proton-c
    • Labels: None
    • Environment: linux epoll

    Description

      The traces are incomplete, but the four-way thread tangle can be inferred as follows (each thread is listed with the pair of mutexes it takes):

        A: pn_proactor_set_timeout()   (p->task.mutex + tm->task.mutex)
        B: pni_timer_manager_process() (tm->task.mutex + tm->deletion_mutex)
        C: pni_connection_timeout()    (tm->deletion_mutex + pc1->task.mutex)
        D: proactor_remove()           (pc1->task.mutex + p->task.mutex)

      While this particular trace is a false positive (D occurs after all other threads have been joined and there are no competing threads to complete the circle), the lock ordering is clearly asking for eventual trouble.
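      The ordering concern can be illustrated by treating each thread's lock pair as a directed edge in a lock-order graph and checking for a cycle. The sketch below is illustrative only: the enum labels, `lock_graph`, `has_cycle` and `trace_graph` are hypothetical names, not proton-c code, and it assumes each pair in the list above is written as "held first, acquired second".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four mutexes named in the trace. */
enum { P_TASK, TM_TASK, TM_DELETION, PC1_TASK, N_LOCKS };

/* edge[a][b] is true when some thread acquires b while holding a. */
typedef struct { bool edge[N_LOCKS][N_LOCKS]; } lock_graph;

static void add_order(lock_graph *g, int held, int wanted) {
    assert(held >= 0 && held < N_LOCKS);
    assert(wanted >= 0 && wanted < N_LOCKS);
    g->edge[held][wanted] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on the stack, 2 = finished. */
static bool visit(const lock_graph *g, int node, int state[N_LOCKS]) {
    state[node] = 1;
    for (int next = 0; next < N_LOCKS; next++) {
        if (!g->edge[node][next]) continue;
        if (state[next] == 1) return true;   /* back edge: cycle found */
        if (state[next] == 0 && visit(g, next, state)) return true;
    }
    state[node] = 2;
    return false;
}

/* Returns true if the lock-order graph contains a cycle. */
bool has_cycle(const lock_graph *g) {
    int state[N_LOCKS] = {0};
    for (int n = 0; n < N_LOCKS; n++) {
        if (state[n] == 0 && visit(g, n, state)) return true;
    }
    return false;
}

/* Builds the graph inferred from the trace. Assumption: the first mutex
 * of each pair is held while the second is acquired. include_a selects
 * whether thread A's ordering (p->task.mutex then tm->task.mutex) is present. */
lock_graph trace_graph(bool include_a) {
    lock_graph g = {0};
    if (include_a) add_order(&g, P_TASK, TM_TASK);   /* A */
    add_order(&g, TM_TASK, TM_DELETION);             /* B */
    add_order(&g, TM_DELETION, PC1_TASK);            /* C */
    add_order(&g, PC1_TASK, P_TASK);                 /* D */
    return g;
}
```

      Under that assumption, `has_cycle` reports a cycle when all four orderings are present and none once A's edge is dropped, which is why removing any one participant is enough to break the tangle.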

      The proactor set_timeout and cancel_timeout API calls do not need to hold the proactor task lock while interacting with the timer manager, but do so as a convenience to prevent collisions between simultaneous sets/cancels.  A separate lock can achieve that purpose, stopping A from participating in the potential deadlock.
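      A minimal sketch of the separate-lock idea, using plain pthreads, might look like the following. It is not the actual proton-c change: the `sketch_*` types and functions and the `timeout_mutex` field are hypothetical names chosen for illustration, and the timer manager is reduced to a deadline and a flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the timer manager; not the real proton-c struct. */
typedef struct {
    pthread_mutex_t task_mutex;   /* plays the role of tm->task.mutex */
    uint64_t deadline;
    bool armed;
} sketch_timer_manager;

/* Hypothetical stand-in for the proactor. */
typedef struct {
    pthread_mutex_t task_mutex;     /* plays the role of p->task.mutex; not taken below */
    pthread_mutex_t timeout_mutex;  /* assumed new lock: serializes set/cancel only */
    sketch_timer_manager tm;
} sketch_proactor;

void sketch_proactor_init(sketch_proactor *p) {
    assert(p != NULL);
    pthread_mutex_init(&p->task_mutex, NULL);
    pthread_mutex_init(&p->timeout_mutex, NULL);
    pthread_mutex_init(&p->tm.task_mutex, NULL);
    p->tm.deadline = 0;
    p->tm.armed = false;
}

/* Updates the timer state under the timer manager's own lock. */
static void sketch_tm_update(sketch_timer_manager *tm, bool armed, uint64_t deadline) {
    pthread_mutex_lock(&tm->task_mutex);
    tm->armed = armed;
    tm->deadline = deadline;
    pthread_mutex_unlock(&tm->task_mutex);
}

/* Set and cancel exclude each other via timeout_mutex, so the only
 * ordering introduced here is timeout_mutex -> tm.task_mutex. */
void sketch_set_timeout(sketch_proactor *p, uint64_t deadline) {
    pthread_mutex_lock(&p->timeout_mutex);
    sketch_tm_update(&p->tm, true, deadline);
    pthread_mutex_unlock(&p->timeout_mutex);
}

void sketch_cancel_timeout(sketch_proactor *p) {
    pthread_mutex_lock(&p->timeout_mutex);
    sketch_tm_update(&p->tm, false, 0);
    pthread_mutex_unlock(&p->timeout_mutex);
}
```

      Because no other path in this sketch acquires `timeout_mutex`, the set and cancel calls no longer create an ordering between the proactor task mutex and the timer manager mutex. Simultaneous sets and cancels still cannot interleave.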

       

      Attachments

        1. tsan_out.txt (5 kB, Clifford Jansen)

          People

            cliffjansen Clifford Jansen