Spark / SPARK-26771

Make .unpersist(), .destroy() consistently non-blocking by default

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: GraphX, Spark Core
    • Docs Text:
      The RDD and DataFrame .unpersist() methods and the Broadcast .destroy() method take an optional 'blocking' argument. Previously the default was 'false' in all cases except for (Scala) RDDs and their GraphX subclasses. The default is now 'false' (non-blocking) in all of these methods. PySpark's RDD and Broadcast classes now accept an optional 'blocking' argument as well, with the same behavior. Finally, cached queries are now also unpersisted internally without blocking.

      Description

      See https://issues.apache.org/jira/browse/SPARK-26728 and https://github.com/apache/spark/pull/23650 .

      RDD and DataFrame expose an .unpersist() method with an optional "blocking" argument, as does Broadcast.destroy(). This argument is false by default except in the Scala RDD implementation (not PySpark) and its GraphX subclasses. Most callers of these methods already request non-blocking behavior. Waiting for the resources to be freed is rarely necessary, except in tests that assert on the behavior of these methods, which typically request blocking.

      This proposes to make the default false across these methods, and adjust callers to only request non-default blocking behavior where important, such as in a few key tests.
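
As a rough illustration of what "request blocking only where important" can look like on the Python side, here is a minimal sketch. The helper name `cached` and its `wait` parameter are hypothetical and not part of Spark. It assumes a PySpark 3.0+ RDD or DataFrame, whose .persist() and .unpersist(blocking=...) methods are the only calls used.

```python
from contextlib import contextmanager


@contextmanager
def cached(dataset, wait=False):
    """Persist `dataset` for the duration of a `with` block, then release it.

    `cached` and `wait` are hypothetical names for this sketch.
    wait=False matches the non-blocking default proposed in this issue;
    wait=True asks Spark to block until the cached blocks are removed,
    which is mostly useful in tests that assert on storage state.
    """
    dataset.persist()
    try:
        yield dataset
    finally:
        # Always pass `blocking` explicitly so the intent is visible
        # at the call site, whatever the library default is.
        dataset.unpersist(blocking=wait)


# Usage with a real session (requires PySpark 3.0+; not run here):
#   rdd = spark.sparkContext.parallelize(range(100))
#   with cached(rdd) as r:              # released without waiting
#       total = r.sum()
#   with cached(rdd, wait=True) as r:   # released synchronously
#       total = r.sum()
```

Passing the flag explicitly keeps ordinary code paths non-blocking while letting a test opt in to synchronous cleanup. Broadcast.destroy(blocking=...) could be wrapped the same way.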

            People

            • Assignee: Sean Owen (srowen)
            • Reporter: Sean Owen (srowen)
