Spark / SPARK-28548

explain() shows wrong result for persisted DataFrames after some operations

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      If certain operations are run on a Dataset before it is persisted, Dataset.explain shows the wrong plan afterwards.
      One of those operations is explain() itself.
      For example:

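      val df = spark.range(10)
      df.explain   // first call, before the Dataset is persisted
      df.persist
      df.explain   // second call, expected to reflect the persisted Dataset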
      

      The expected result of the second explain is as follows.

      == Physical Plan ==
      *(1) ColumnarToRow
      +- InMemoryTableScan [id#7L]
            +- InMemoryRelation [id#7L], StorageLevel(disk, memory, deserialized, 1 replicas)
                  +- *(1) Range (0, 10, step=1, splits=12)
      

      But I got this instead.

      == Physical Plan ==
      *(1) Range (0, 10, step=1, splits=12)
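
The mismatch can also be checked programmatically rather than by reading the printed plans. The following is a minimal sketch, assuming a Spark 3.x build on the classpath and a local master. The object name and app name are made up for illustration. It also assumes that the plan held by the Dataset's existing QueryExecution is a reasonable stand-in for what explain() prints, and it compares that plan with one planned afresh from the same logical plan through the unstable `sessionState` API. No output is claimed here, since the result depends on the Spark build being run.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name; not part of Spark or of the original report.
object ExplainAfterPersistCheck {
  def main(args: Array[String]): Unit = {
    // Local session for the sketch; master and app name are arbitrary choices.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SPARK-28548-check")
      .getOrCreate()

    val df = spark.range(10)
    df.explain()   // same first step as in the report
    df.persist()

    // Physical plan held by the Dataset's existing QueryExecution.
    val existingPlan = df.queryExecution.executedPlan.toString

    // Physical plan from a QueryExecution built afresh from the same
    // logical plan (sessionState is an unstable, internal-facing API).
    val freshPlan = spark.sessionState
      .executePlan(df.queryExecution.logical)
      .executedPlan
      .toString

    // A plan that reads from the cache should mention InMemoryTableScan.
    println(s"existing plan uses cache: ${existingPlan.contains("InMemoryTableScan")}")
    println(s"fresh plan uses cache:    ${freshPlan.contains("InMemoryTableScan")}")

    df.unpersist()
    spark.stop()
  }
}
```

If the two flags differ, the Dataset is holding on to a plan computed before persist was called, which would be consistent with the output shown above.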
      

            People

              Assignee: Kousuke Saruta (sarutak)
              Reporter: Kousuke Saruta (sarutak)
              Votes: 0
              Watchers: 2
