Spark / SPARK-30368

Add computed rows metric to InMemoryRelation and show in SQL UI

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: SQL, Web UI
    • Labels: None

    Description

      This is a follow-up JIRA for: https://issues.apache.org/jira/browse/SPARK-30367

      We should add a "number of computed rows" metric to InMemoryRelation. It would show the user how many rows were computed using the InMemoryRelation's cached plan: zero if no data had to be computed, the same as the total number of rows read if every row had to be computed, or some subset of that total if only some partitions had to be recomputed. This would help determine how much work was done for this part of the query.
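
      The three cases above can be illustrated with a small standalone model. This is only a sketch in plain Python, not Spark code: CachedRelationModel, compute_partition, evict, and the metric names rows_read and rows_computed are all hypothetical names chosen for illustration, and the real InMemoryRelation caches columnar batches rather than Python lists.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CachedRelationModel:
    """Toy stand-in for a cached relation (hypothetical, not Spark's API)."""

    compute_partition: Callable[[int], List[int]]  # runs the "cached plan" for one partition
    num_partitions: int
    cache: Dict[int, List[int]] = field(default_factory=dict)

    def evict(self, pid: int) -> None:
        """Drop one partition from the cache so the next scan must recompute it."""
        self.cache.pop(pid, None)

    def scan(self) -> Tuple[List[int], Dict[str, int]]:
        """Read every partition and report per-scan metrics."""
        rows_read = 0
        rows_computed = 0  # the proposed "number of computed rows"
        out: List[int] = []
        for pid in range(self.num_partitions):
            if pid not in self.cache:
                # Cache miss: compute the partition and count its rows as computed.
                self.cache[pid] = self.compute_partition(pid)
                rows_computed += len(self.cache[pid])
            rows = self.cache[pid]
            rows_read += len(rows)
            out.extend(rows)
        return out, {"rows_read": rows_read, "rows_computed": rows_computed}


# Four partitions of three rows each.
rel = CachedRelationModel(lambda pid: [pid * 10 + i for i in range(3)], 4)

_, first = rel.scan()   # nothing cached yet: every row read is also computed
_, second = rel.scan()  # fully cached: rows are read but none are computed
rel.evict(1)
_, third = rel.scan()   # one partition recomputed: computed is a subset of read
```

      In this model the first scan reports equal read and computed counts, the second reports zero computed rows, and the third reports only the rows of the evicted partition as computed.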

      An example with the metric, where the InMemoryRelation's data was fully computed from its plan:

      [Screenshot: w-metric.png]

      Attachments

        1. w-metric.png
          78 kB
          Max Thompson

          People

            Assignee: Unassigned
            Reporter: Max Thompson (maxthomp)