SPARK-25822: Fix a race condition when releasing a Python worker

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.2
    • Fix Version/s: 2.3.3, 2.4.0
    • Component/s: PySpark
    • Labels: None

      Description

      There is a race condition when releasing a Python worker. If "ReaderIterator.handleEndOfDataSection" is not running in the task thread and a task is terminated early (for example, by "take(N)"), the task completion listener may close the worker while "handleEndOfDataSection" can still put the same worker back into the worker pool for reuse.

      https://github.com/zsxwing/spark/commit/0e07b483d2e7c68f3b5c3c118d0bf58c501041b7 is a patch that reproduces this issue.

      A user also reported this on the mailing list: http://mail-archives.apache.org/mod_mbox/spark-user/201610.mbox/%3CCAAUq=H+YLUEpd23nwvq13Ms5hOStkhX3ao4f4zQV6sgO5zM-xA@mail.gmail.com%3E
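
      To make the window concrete, here is a minimal, self-contained Scala sketch of the race and of the usual fix shape: an atomic flag so that exactly one of "release to pool" or "close" can win. The names (Worker, WorkerPool, releasedOrClosed) are hypothetical stand-ins for illustration, not Spark's actual classes.

        import java.util.concurrent.ConcurrentLinkedQueue
        import java.util.concurrent.atomic.AtomicBoolean

        // Hypothetical stand-ins for the Python worker and its reuse pool.
        class Worker { def close(): Unit = println("worker closed") }
        object WorkerPool {
          private val idle = new ConcurrentLinkedQueue[Worker]()
          def release(w: Worker): Unit = { idle.add(w); println("worker returned to pool") }
        }

        object RaceSketch {
          def main(args: Array[String]): Unit = {
            val worker = new Worker
            // Exactly one cleanup action may win; the loser must do nothing.
            val releasedOrClosed = new AtomicBoolean(false)

            // Simulates handleEndOfDataSection running off the task thread.
            val reader = new Thread(() =>
              if (releasedOrClosed.compareAndSet(false, true)) WorkerPool.release(worker))

            // Simulates the task completion listener firing after an early take(N).
            val listener = new Thread(() =>
              if (releasedOrClosed.compareAndSet(false, true)) worker.close())

            reader.start(); listener.start()
            reader.join(); listener.join()
            // Without the flag, both paths could run: a closed worker could be
            // handed out of the pool to the next task.
          }
        }

      Guarding both cleanup paths with a single compareAndSet means the outcome no longer depends on which thread observes the end of data first.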

    People

    • Assignee: Shixiong Zhu (zsxwing)
    • Reporter: Shixiong Zhu (zsxwing)
    • Votes: 0
    • Watchers: 3
