Spark / SPARK-22637

CatalogImpl.refresh() has quadratic complexity for a view


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.2, 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      org.apache.spark.sql.internal.CatalogImpl.refreshTable uses foreach(..) to refresh all tables in a view. This traverses every node in the plan's subtree and calls LogicalPlan.refresh() on each one. However, LogicalPlan.refresh() already refreshes its children, so each node is refreshed once for itself and once more for every ancestor above it. As a result, refreshing a large view can be quite expensive.
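
The cost described above can be illustrated with a small toy model. This is only a sketch and not Spark code: `Node`, `chain`, `refresh_via_foreach`, and `refresh_once` are hypothetical names, and the tree is assumed to be a simple chain of single-child nodes, which is the worst case for this pattern. It counts how many times `refresh()` runs when it is invoked from a `foreach` over every node, compared with a single call on the root.

```python
class Node:
    """Toy stand-in for a plan node (hypothetical, not Spark's LogicalPlan)."""

    def __init__(self, children=()):
        self.children = list(children)
        self.refresh_calls = 0

    def refresh(self):
        # Refresh this node, then recurse into its children,
        # mirroring the behaviour described in the issue.
        self.refresh_calls += 1
        for child in self.children:
            child.refresh()

    def foreach(self, f):
        # Pre-order traversal that applies f to every node in the subtree.
        f(self)
        for child in self.children:
            child.foreach(f)


def chain(depth):
    """Build a chain of `depth` nodes, each with a single child."""
    node = Node()
    for _ in range(depth - 1):
        node = Node([node])
    return node


def total_refresh_calls(root):
    """Sum the refresh counters over every node in the tree."""
    nodes = []
    root.foreach(nodes.append)
    return sum(n.refresh_calls for n in nodes)


def refresh_via_foreach(root):
    # The pattern from the issue: visit every node and call refresh() on it,
    # even though refresh() already recurses on its own.
    root.foreach(lambda n: n.refresh())


def refresh_once(root):
    # A single top-level call already reaches every node exactly once.
    root.refresh()


if __name__ == "__main__":
    a = chain(100)
    refresh_via_foreach(a)
    print(total_refresh_calls(a))  # 5050 = 100 * 101 / 2

    b = chain(100)
    refresh_once(b)
    print(total_refresh_calls(b))  # 100
```

In this model a chain of n nodes costs n(n+1)/2 refreshes through `foreach` but only n through one top-level call. Bushier trees grow more slowly than the chain, but every node is still refreshed once per ancestor.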


          People

            Assignee: Herman van Hövell (hvanhovell)
            Reporter: Herman van Hövell (hvanhovell)
            Votes: 0
            Watchers: 2
