Spark / SPARK-19737

New analysis rule for reporting unregistered functions without relying on relation resolution


    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None


      Consider the following simple SQL query, which references an undefined function foo that was never registered in the function registry:

      SELECT foo(a) FROM t

      Assuming t is a partitioned temporary view backed by a large number of files stored on S3, the analyzer may take a long time to report that foo is not registered.

      The reason is that the existing analysis rule ResolveFunctions requires all child expressions to be resolved first. Therefore, ResolveRelations has to be executed first to resolve all columns referenced by the unresolved function invocation. This further leads to partition discovery for t, which may take a long time.
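The ordering constraint can be illustrated with a minimal sketch. It uses a toy expression model, not Catalyst's real classes: `Attribute`, `FunctionCall`, and `tryLookup` are hypothetical names introduced only for this example. The point is that the registry is not consulted until every argument is resolved, which is what forces relation resolution to happen first.

```scala
// Toy model only: hypothetical types, not Catalyst's expression classes.
object ResolveOrderSketch {
  sealed trait Expr { def resolved: Boolean }

  final case class Attribute(name: String, resolved: Boolean) extends Expr

  final case class FunctionCall(name: String, args: Seq[Expr]) extends Expr {
    // An invocation that has not been looked up yet is never resolved itself.
    val resolved: Boolean = false
  }

  // Mimics the guard described above: the registry is consulted only once
  // every argument is resolved. Returns None when the call is skipped.
  def tryLookup(call: FunctionCall, registry: Set[String]): Option[Boolean] =
    if (call.args.forall(_.resolved)) Some(registry.contains(call.name)) else None
}
```

In this model, `foo(a)` with an unresolved `a` is skipped entirely, so the missing function goes unnoticed until `a` has been resolved against its relation.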

      To address this case, we propose a new lightweight analysis rule LookupFunctions that

      1. Matches all unresolved function invocations
      2. Looks up the function names in the function registry
      3. Reports an analysis error for any unregistered function

      Since this rule doesn't actually try to resolve the unresolved functions, it doesn't rely on ResolveRelations and therefore doesn't trigger partition discovery.
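The three steps can be sketched on a toy expression tree. This is an illustration under stated assumptions, not the rule as implemented in Spark: the types below only mimic Catalyst's, the registry is modeled as a plain `Set[String]`, and `IllegalArgumentException` stands in for a real analysis error.

```scala
// Toy model only: these types mimic, but are not, Catalyst's classes.
object LookupFunctionsSketch {
  sealed trait Expr { def children: Seq[Expr] }

  final case class UnresolvedAttribute(name: String) extends Expr {
    val children: Seq[Expr] = Nil
  }

  final case class UnresolvedFunction(name: String, children: Seq[Expr]) extends Expr

  // Step 1: collect every unresolved function invocation in the tree,
  // including invocations nested inside other invocations' arguments.
  def collectFunctions(e: Expr): Seq[UnresolvedFunction] = {
    val nested = e.children.flatMap(collectFunctions)
    e match {
      case f: UnresolvedFunction => f +: nested
      case _ => nested
    }
  }

  // Steps 2 and 3: look each name up in the registry and fail on the first miss.
  // Arguments are inspected only structurally and never resolved, so no
  // relation has to be resolved and no partition discovery is triggered.
  def check(exprs: Seq[Expr], registry: Set[String]): Unit =
    exprs.flatMap(collectFunctions).foreach { f =>
      if (!registry.contains(f.name))
        throw new IllegalArgumentException(s"Undefined function: '${f.name}'")
    }
}
```

For the query above, calling `check` on `UnresolvedFunction("foo", Seq(UnresolvedAttribute("a")))` with a registry that lacks `foo` fails immediately, even though `a` is still unresolved.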

      We may put this analysis rule in a separate Once rule batch that sits between the "Substitution" batch and the "Resolution" batch. This way the rule runs only once rather than on every fixed-point iteration, and it is guaranteed to run before ResolveRelations.
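The effect of the batch placement can be sketched with a stripped-down rule executor. This is a toy stand-in, not Spark's actual executor: a plan is modeled as a set of rule names already applied, the batch name "Function Lookup" and the rule name "SubstitutionRule" are placeholders, and the iteration limit of 100 is an arbitrary assumption.

```scala
import scala.collection.mutable

// Toy model only: a simplified stand-in for a batched rule executor.
object BatchOrderSketch {
  sealed trait Strategy
  case object Once extends Strategy
  final case class FixedPoint(maxIterations: Int) extends Strategy

  // A "plan" is just the set of rule names that have been applied to it.
  type Plan = Set[String]
  final case class Rule(name: String, run: Plan => Plan)
  final case class Batch(name: String, strategy: Strategy, rules: Seq[Rule])

  // Runs batches in order. Each batch repeats until the plan stops changing
  // or its iteration limit is reached; `trace` records every rule invocation.
  def execute(batches: Seq[Batch], initial: Plan, trace: mutable.Buffer[String]): Plan =
    batches.foldLeft(initial) { (start, batch) =>
      val maxIterations = batch.strategy match {
        case Once => 1
        case FixedPoint(n) => n
      }
      var plan = start
      var iteration = 0
      var changed = true
      while (changed && iteration < maxIterations) {
        val next = batch.rules.foldLeft(plan) { (p, rule) =>
          trace += rule.name
          rule.run(p)
        }
        changed = next != plan
        plan = next
        iteration += 1
      }
      plan
    }

  // A placeholder rule that simply marks itself as applied.
  def rule(name: String): Rule = Rule(name, plan => plan + name)

  val batches: Seq[Batch] = Seq(
    Batch("Substitution", FixedPoint(100), Seq(rule("SubstitutionRule"))),
    Batch("Function Lookup", Once, Seq(rule("LookupFunctions"))), // hypothetical batch name
    Batch("Resolution", FixedPoint(100), Seq(rule("ResolveRelations"), rule("ResolveFunctions")))
  )
}
```

Running `execute` over these batches invokes LookupFunctions exactly once and before the first ResolveRelations call, whereas the rules in the fixed-point batches are invoked again on a second pass that confirms the plan has stopped changing.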




            Assignee: Cheng Lian
            Reporter: Cheng Lian