  1. Spark
  2. SPARK-22216 Improving PySpark/Pandas interoperability
  3. SPARK-24334

Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.1, 2.4.0
    • Component/s: PySpark
    • Labels:
      None

      Description

      Currently, ArrowPythonRunner has two threads that free the Arrow vector schema root and allocator: the main writer thread and the task completion listener thread.

      Having both threads perform the cleanup leads to inconsistent failures (e.g., negative reference counts, NPEs, and memory leak exceptions) when an exception is thrown from the user function.
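
      The double-cleanup hazard can be illustrated with a minimal sketch. It does not use the Arrow or Spark APIs: FakeAllocator, SingleOwnerCleanup, and run_task are hypothetical names invented for illustration, and the idempotent guard shown here is only one possible way to make sure a shared resource is released exactly once. It is not necessarily how the issue was resolved in Spark.

```python
import threading


class FakeAllocator:
    """Hypothetical stand-in for a ref-counted allocator (not the Arrow API)."""

    def __init__(self):
        self.ref_count = 1
        self.close_calls = 0

    def close(self):
        # Releases unconditionally: a second call would push ref_count below zero.
        self.close_calls += 1
        self.ref_count -= 1


class SingleOwnerCleanup:
    """Runs the underlying close() at most once, whichever thread asks first."""

    def __init__(self, resource):
        self._resource = resource
        self._lock = threading.Lock()
        self._closed = False

    def close(self):
        with self._lock:
            if self._closed:
                return False  # another thread already released the resource
            self._closed = True
            self._resource.close()
            return True


def run_task(user_function):
    """Simulates a writer thread and a completion listener sharing one allocator."""
    allocator = FakeAllocator()
    cleanup = SingleOwnerCleanup(allocator)

    def writer():
        try:
            user_function()
        except Exception:
            pass  # the failure path is where both threads try to clean up
        finally:
            cleanup.close()

    thread = threading.Thread(target=writer)
    thread.start()
    cleanup.close()  # plays the role of the task completion listener
    thread.join()
    return allocator
```

      Calling run_task with a function that raises leaves the allocator with exactly one close() call, regardless of which thread reaches the cleanup first. An alternative design, under the same assumptions, is to drop the cleanup from the writer thread entirely so that only the completion listener ever frees the resources.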

       


            People

            • Assignee:
              icexelloss Li Jin
              Reporter:
              icexelloss Li Jin
            • Votes:
              1
              Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved: