  1. Spark
  2. SPARK-35260 DataSourceV2 Function Catalog implementation
  3. SPARK-36695

Allow passing V2 functions to data sources via V2 filters


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.3.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      The V2 filter API currently allows only NamedReference in predicates that are pushed down to data sources. It may be beneficial to allow V2 functions in predicates as well, so that function pushdown can be implemented. Trino (Presto) also supports this feature.

      One use case is pushing down predicates such as bucket(col, 32) = 10, which would allow data sources such as Iceberg to scan only a single partition.
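
      The bucket use case can be illustrated with a small, self-contained sketch. It is not Spark's or Iceberg's API: NamedReference only borrows its name from the description above, while Literal, FunctionCall, EqualTo, toy_bucket, prunable_bucket and plan_scan are hypothetical names invented for this example, and toy_bucket is a plain modulo rather than Iceberg's real bucket hash. The sketch assumes a table partitioned by bucket(col, 32) and shows how a source that receives the function call inside the predicate could keep only the matching partition.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class NamedReference:
    """Stand-in for a column reference (not Spark's class)."""
    name: str


@dataclass(frozen=True)
class Literal:
    """Hypothetical integer literal."""
    value: int


@dataclass(frozen=True)
class FunctionCall:
    """Hypothetical: a named function applied to argument expressions."""
    name: str
    args: Tuple[Union[NamedReference, Literal], ...]


@dataclass(frozen=True)
class EqualTo:
    """Hypothetical equality predicate: left = right."""
    left: Union[NamedReference, FunctionCall]
    right: Literal


def toy_bucket(value: int, num_buckets: int) -> int:
    # Assumption: simple modulo, NOT Iceberg's actual bucket transform.
    return value % num_buckets


def prunable_bucket(pred: EqualTo, col: str, num_buckets: int) -> Optional[int]:
    """Return the bucket id if pred is bucket(col, num_buckets) = <literal>."""
    left = pred.left
    expected_args = (NamedReference(col), Literal(num_buckets))
    if isinstance(left, FunctionCall) and left.name == "bucket" and left.args == expected_args:
        return pred.right.value
    return None


def plan_scan(partitions: Dict[int, List[int]], pred: EqualTo,
              col: str, num_buckets: int) -> List[int]:
    """Return the partition ids the source would have to scan."""
    bucket_id = prunable_bucket(pred, col, num_buckets)
    if bucket_id is None:
        # Predicate not understood by the source: scan every partition.
        return sorted(partitions)
    return [b for b in sorted(partitions) if b == bucket_id]


# Build a toy table: values 0..99 grouped into 32 bucket partitions.
partitions: Dict[int, List[int]] = {}
for v in range(100):
    partitions.setdefault(toy_bucket(v, 32), []).append(v)

# bucket(col, 32) = 10
pred = EqualTo(FunctionCall("bucket", (NamedReference("col"), Literal(32))), Literal(10))
print(plan_scan(partitions, pred, "col", 32))  # [10]
```

      A predicate the sketch does not recognize, such as col = 10 expressed with only a NamedReference, falls back to scanning all 32 partitions, which is the behavior the description aims to improve on.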


          People

            Assignee: Unassigned
            Reporter: Chao Sun (csun)
            Votes: 0
            Watchers: 4

