  1. Spark
  2. SPARK-26221 Improve Spark SQL instrumentation and metrics
  3. SPARK-26327

Metrics in FileSourceScanExec are not updated correctly when relation.partitionSchema is set


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.2.3, 2.3.3, 2.4.1, 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

      In the current implementation of `FileSourceScanExec`, the "numFiles" and "metadataTime" (fileListingTime) metrics are updated when the lazy val `selectedPartitions` is initialized, in the case where relation.partitionSchema is set. However, `selectedPartitions` is first initialized through `metadata`, which is called by `queryExecution.toString` in `SQLExecution.withNewExecutionId`. As a result, when `SQLMetrics.postDriverMetricUpdates` is called, there is no corresponding entry in liveExecutions in SQLAppStatusListener yet, so the metrics update has no effect.
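
      The ordering problem described above can be reproduced without Spark. The following is a minimal sketch, not the actual Spark code: `ToyListener`, `ToyScan`, and `LazyMetricsDemo` are hypothetical stand-ins for `SQLAppStatusListener`, `FileSourceScanExec`, and the driver flow, and the assumption that updates for unregistered executions are silently dropped is taken from the description.

```scala
import scala.collection.mutable

// Hypothetical stand-in for SQLAppStatusListener: it only keeps metric
// updates for executions that are already registered as live.
class ToyListener {
  private val liveExecutions =
    mutable.Map.empty[Long, mutable.Map[String, Long]]

  def onExecutionStart(id: Long): Unit =
    liveExecutions(id) = mutable.Map.empty[String, Long]

  // Assumption from the description: updates for an unknown execution are dropped.
  def onDriverMetricUpdates(id: Long, updates: Map[String, Long]): Unit =
    liveExecutions.get(id).foreach(_ ++= updates)

  def metrics(id: Long): Map[String, Long] =
    liveExecutions.get(id).map(_.toMap).getOrElse(Map.empty[String, Long])
}

// Hypothetical stand-in for FileSourceScanExec.
class ToyScan(listener: ToyListener, executionId: Long) {
  // The metric update is a side effect of the first access only;
  // later accesses return the cached value without posting anything.
  lazy val selectedPartitions: Seq[String] = {
    val files = Seq("part-0", "part-1")
    listener.onDriverMetricUpdates(
      executionId, Map("numFiles" -> files.size.toLong))
    files
  }

  // Building the plan description touches selectedPartitions.
  def metadata: String = s"PartitionCount: ${selectedPartitions.size}"
}

object LazyMetricsDemo {
  def run(describeBeforeStart: Boolean): Map[String, Long] = {
    val listener = new ToyListener
    val scan = new ToyScan(listener, executionId = 1L)
    if (describeBeforeStart) {
      val _ = scan.metadata            // plays the role of queryExecution.toString
    }
    listener.onExecutionStart(1L)      // the execution becomes live only now
    val _ = scan.selectedPartitions    // actual use during execution
    listener.metrics(1L)
  }

  def main(args: Array[String]): Unit = {
    println(run(describeBeforeStart = true))
    println(run(describeBeforeStart = false))
  }
}
```

      When the description is built before the execution is registered, the lazy val has already fired and its update is lost, so the first call returns an empty map. The second call, which first touches `selectedPartitions` after registration, records `numFiles`.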


          People

            Assignee: XuanYuan Yuanjian Li
            Reporter: XuanYuan Yuanjian Li
            Votes: 0
            Watchers: 4
