Details
-
Improvement
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
3.1.0
-
None
-
None
Description
This is a follow-up JIRA for: https://issues.apache.org/jira/browse/SPARK-30367
We should add a "number of computed rows" metric to InMemoryRelation. This will show the user how many rows were computed using the InMemoryRelation's cached plan (e.g. possibly zero rows if no data had to be computed, the same amount as total rows read if all rows had to be computed, some subset of the total rows read if some partitions had to be recomputed, etc) which would help with determining how much work was done for this part of the query.
An example with the metric where the InMemoryRelation's data was fully computed from its plan: