  1. Spark
  2. SPARK-26221 Improve Spark SQL instrumentation and metrics
  3. SPARK-26327

Metrics in FileSourceScanExec are not updated correctly when relation.partitionSchema is set


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.2.3, 2.3.3, 2.4.1, 3.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

      In the current `FileSourceScanExec` implementation, when relation.partitionSchema is set, the "numFiles" and "metadataTime" (file listing time) metrics are updated when the lazy val `selectedPartitions` is initialized. However, `selectedPartitions` is first initialized through `metadata`, which is evaluated by `queryExecution.toString` in `SQLExecution.withNewExecutionId`. At that point, when `SQLMetrics.postDriverMetricUpdates` is called, there is no corresponding entry in liveExecutions in SQLAppStatusListener, so the metrics update has no effect.
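
      The ordering problem can be illustrated with a small, self-contained toy model. This is only a sketch and not Spark's actual code: `ToyListener`, `ToyScan`, `onExecutionStart`, `onDriverAccumUpdates`, and `LazyMetricsDemo` are hypothetical names standing in for SQLAppStatusListener, `FileSourceScanExec`, and the driver metric update path. It assumes the listener silently ignores updates for execution IDs it does not yet track, as the description implies.

```scala
import scala.collection.mutable

// Hypothetical stand-in for SQLAppStatusListener: it only tracks
// executions for which a start event has already been seen.
class ToyListener {
  val liveExecutions = mutable.Map.empty[Long, mutable.Map[String, Long]]

  def onExecutionStart(id: Long): Unit =
    liveExecutions(id) = mutable.Map.empty

  // Updates for an unknown execution ID are silently dropped (assumed behavior).
  def onDriverAccumUpdates(id: Long, updates: Map[String, Long]): Unit =
    liveExecutions.get(id).foreach(_ ++= updates)
}

// Hypothetical stand-in for FileSourceScanExec.
class ToyScan(listener: ToyListener, executionId: Long) {
  // Metrics are posted as a side effect of the first (and only) initialization.
  lazy val selectedPartitions: Seq[String] = {
    val files = Seq("part=1/f0", "part=2/f1")
    listener.onDriverAccumUpdates(executionId, Map("numFiles" -> files.size.toLong))
    files
  }

  // Building the metadata forces selectedPartitions.
  lazy val metadata: Map[String, String] =
    Map("PartitionCount" -> selectedPartitions.size.toString)
}

object LazyMetricsDemo {
  // Returns the "numFiles" value the listener ends up with, if any.
  def run(describeBeforeStart: Boolean): Option[Long] = {
    val listener = new ToyListener
    val scan = new ToyScan(listener, 0L)
    if (describeBeforeStart) scan.metadata // mimics queryExecution.toString running first
    listener.onExecutionStart(0L)          // the execution only becomes "live" here
    scan.selectedPartitions                // already initialized in the first case: no re-post
    listener.liveExecutions(0L).get("numFiles")
  }

  def main(args: Array[String]): Unit = {
    println(run(describeBeforeStart = true))
    println(run(describeBeforeStart = false))
  }
}
```

      In this toy model, `run(describeBeforeStart = true)` returns `None` because the update is posted before the execution is live and the lazy val is never re-evaluated, while `run(describeBeforeStart = false)` returns `Some(2)`.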
