IMPALA-5749

Race in coordinator hits DCHECK on 'num_remaining_backends_ > 0'

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.10.0
    • Fix Version/s: Impala 2.10.0
    • Component/s: Backend
    • Labels:
      None

      Description

      A race in Coordinator::UpdateBackendExecStatus causes Impala to crash on 'DCHECK_GT(num_remaining_backends_, 0)'. It was discovered while running 'test_finst_cancel_when_query_complete' in a loop to reproduce a different issue.

      Only the first exec report received for a particular backend after it has completed is supposed to reach line 992, where we decrement 'num_remaining_backends_'. Per the comments, this is supposed to be ensured by the BackendState::IsDone check on line 945.

      However, the check and the update aren't performed atomically. Two threads can enter UpdateBackendExecStatus at the same time, both check BackendState::IsDone and find it false, and then both proceed to decrement 'num_remaining_backends_', with the second one hitting the DCHECK.
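
      The check-then-act race described above, and one common way to close it, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Impala code and not necessarily the fix that was committed: the class names BackendStateSketch and CoordinatorSketch and the functions MarkDone and RunDuplicateReports are hypothetical. The idea is to fold the "is it done?" check and the "mark it done" update into one locked operation that reports whether this caller performed the transition, so only that caller touches the counter.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for per-backend state (not Impala's BackendState).
class BackendStateSketch {
 public:
  // Marks the backend done. Returns true only for the one caller that
  // performs the not-done -> done transition. The check and the update
  // happen under the same lock, so two callers can never both see "not done".
  bool MarkDone() {
    std::lock_guard<std::mutex> l(lock_);
    if (done_) return false;
    done_ = true;
    return true;
  }

 private:
  std::mutex lock_;
  bool done_ = false;
};

// Hypothetical stand-in for the coordinator.
class CoordinatorSketch {
 public:
  explicit CoordinatorSketch(int num_backends)
      : backends_(num_backends), num_remaining_backends_(num_backends) {}

  // Called once per final status report; duplicates may arrive concurrently.
  void UpdateBackendExecStatus(int backend_idx) {
    if (!backends_[backend_idx].MarkDone()) return;  // Duplicate report.
    std::lock_guard<std::mutex> l(lock_);
    assert(num_remaining_backends_ > 0);  // Plays the role of the DCHECK.
    --num_remaining_backends_;
  }

  int num_remaining_backends() {
    std::lock_guard<std::mutex> l(lock_);
    return num_remaining_backends_;
  }

 private:
  std::vector<BackendStateSketch> backends_;
  std::mutex lock_;
  int num_remaining_backends_;
};

// Sends four concurrent "final" reports for each of two backends and returns
// the remaining-backend count afterwards.
int RunDuplicateReports() {
  CoordinatorSketch coord(2);
  std::vector<std::thread> threads;
  for (int i = 0; i < 8; ++i) {
    threads.emplace_back([&coord, i] { coord.UpdateBackendExecStatus(i % 2); });
  }
  for (std::thread& t : threads) t.join();
  return coord.num_remaining_backends();
}
```

      In this sketch, RunDuplicateReports() returns 0 and the assertion never fires, because each backend's counter decrement runs exactly once no matter how many duplicate reports arrive. If MarkDone were split into a separate IsDone check followed by a later update, two threads could both pass the check and the assertion could fail, which is the failure mode reported here.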

            People

            • Assignee:
              twmarshall Thomas Tauber-Marshall
              Reporter:
              twmarshall Thomas Tauber-Marshall
            • Votes:
              0
              Watchers:
              6
