IMPALA-9611

Hang in HandoffToProbesAndWait() for multithreaded join build

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 3.4.0
    • Fix Version/s: Impala 4.0.0
    • Component/s: Backend
    • Labels:

      Description

      I saw a hang triggered by test_failpoints in JoinBuilder::HandoffToProbesAndWait(), where the thread stayed blocked even though build_side_state->is_cancelled_ was true.

      The sequence of events leading to the bug is as follows:

      • Thread A is in HandoffToProbesAndWait(), reads is_cancelled_ and sees false.
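      • Thread B, in RuntimeState::Cancel(), sets is_cancelled_ = true, acquires cancellation_cvs_lock_, then calls NotifyAll() on the condition variable. Thread A is not yet waiting, so the notification wakes nobody.
      • Thread A calls Wait() on the cv and blocks forever, because it never re-checks is_cancelled_ and no further notification arrives.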

      I think this is most likely to happen if thread A is de-scheduled between reading is_cancelled_ and calling Wait().
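
      The sequence above is a lost-wakeup race. As an illustration only, here is a minimal sketch of one common way to close that window, using the standard library rather than Impala's own classes: the waiter checks the cancellation flag while holding the same lock it waits on, and the canceller takes that lock between setting the flag and notifying. The names CancellableWaiter, WaitForProbesOrCancel, probes_done and RunCancelDemo are hypothetical and are not taken from the Impala code base; this is not necessarily how the issue was actually fixed.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the state shared by the build thread and the
// cancelling thread. Not Impala's actual JoinBuilder or RuntimeState.
struct CancellableWaiter {
  std::mutex lock;
  std::condition_variable cv;
  std::atomic<bool> is_cancelled{false};
  bool probes_done = false;  // Guarded by 'lock'.

  // Blocks until the probes are done or cancellation is observed.
  // Returns true if cancellation was observed.
  bool WaitForProbesOrCancel() {
    std::unique_lock<std::mutex> l(lock);
    // The predicate is evaluated while holding 'lock' and is re-evaluated
    // after every wakeup, so a flag set before we block is never missed.
    cv.wait(l, [this] { return probes_done || is_cancelled.load(); });
    return is_cancelled.load();
  }

  void Cancel() {
    is_cancelled.store(true);
    // Acquire and release 'lock' before notifying. A waiter that already read
    // the flag as false still holds 'lock' until it is atomically parked in
    // wait(), so this notification cannot arrive before the waiter is waiting.
    { std::lock_guard<std::mutex> l(lock); }
    cv.notify_all();
  }
};

// Small driver: one thread waits, the caller cancels, the waiter must return.
bool RunCancelDemo() {
  CancellableWaiter w;
  bool saw_cancel = false;
  std::thread waiter([&] { saw_cancel = w.WaitForProbesOrCancel(); });
  // Give the waiter a chance to block first; correctness does not depend on it.
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  w.Cancel();
  waiter.join();
  return saw_cancel;
}
```

      In this sketch the sleep in RunCancelDemo() only makes the interesting interleaving more likely. The waiter returns whether Cancel() runs before, during, or after the predicate check.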

              People

              • Assignee:
                tarmstrong Tim Armstrong
                Reporter:
                tarmstrong Tim Armstrong
