Issue: SPARK-24334 (sub-task of SPARK-22216, Improving PySpark/Pandas interoperability)

Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.1, 2.4.0
    • Component/s: PySpark
    • Labels: None

    Description

      Currently, ArrowPythonRunner has two threads that free the Arrow vector schema root and allocator: the main writer thread and the task completion listener thread.

      Having both threads perform the cleanup leads to inconsistent failures (e.g., negative reference counts, NPEs, and memory leak exceptions) when an exception is thrown from the user function.
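
      The hazard described above can be illustrated with a minimal Python sketch. It is not the actual Spark patch: `FakeAllocator`, `run_task`, and `failing_udf` are hypothetical names, and the allocator is a plain reference counter rather than a real Arrow allocator. The sketch assumes one possible remedy, namely that only the writer thread releases the resource while the completion listener merely asks it to stop.

```python
import threading


class FakeAllocator:
    """Hypothetical stand-in for an Arrow allocator; tracks a reference count."""

    def __init__(self):
        self.ref_count = 1
        self.close_calls = 0
        self._lock = threading.Lock()

    def close(self):
        with self._lock:
            self.close_calls += 1
            self.ref_count -= 1  # a second close() would drive this negative


def run_task(user_func, rows):
    """Run user_func over rows in a writer thread that alone owns cleanup."""
    allocator = FakeAllocator()
    stop_requested = threading.Event()
    errors = []

    def writer():
        try:
            for row in rows:
                if stop_requested.is_set():
                    break
                user_func(row)
        except Exception as exc:  # record the user function's failure
            errors.append(exc)
        finally:
            allocator.close()  # the only place the allocator is released

    def on_task_completion():
        # The listener only signals the writer; it never closes resources.
        stop_requested.set()

    thread = threading.Thread(target=writer)
    thread.start()
    on_task_completion()  # may run concurrently with the writer
    thread.join()
    return allocator, errors


def failing_udf(row):
    raise ValueError(f"bad row: {row}")


allocator, errors = run_task(failing_udf, [1, 2, 3])
print(allocator.ref_count, allocator.close_calls)
```

      Because `close()` is reached only through the writer's `finally` block, it runs exactly once whether the user function fails or the listener fires first. If `on_task_completion` also called `allocator.close()`, the counter could reach -1, which mirrors the negative reference count mentioned in the description.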

       

          People

            Assignee: Li Jin (icexelloss)
            Reporter: Li Jin (icexelloss)
            Votes: 1
            Watchers: 4
