  1. Spark
  2. SPARK-30374 Feature Parity between PostgreSQL and Spark (ANSI/SQL)
  3. SPARK-32934

Improve the performance of NTH_VALUE and refactor the OffsetWindowFunction


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: SQL
    • Labels: None

    Description

      Spark SQL supports window functions such as NTH_VALUE.
      If we specify a window frame like

      UNBOUNDED PRECEDING AND CURRENT ROW
      

      or

      UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
      

      then we can eliminate some calculations.
      For example, if we execute the SQL shown below:

      SELECT NTH_VALUE(col, 2) OVER (
               ORDER BY rank
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
      FROM tab;
      

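      For every row whose row number is greater than 1, the output is the same fixed value (the value of col in the second row); for the first row, the output is null. So we only need to calculate the value once and check whether the row number is less than 2.
      The UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING case is simpler: every row sees the whole partition as its frame, so the result is the same for all rows.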
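The idea can be sketched in plain Python. This is only an illustration, not Spark's implementation: the function names are hypothetical, and it assumes a single partition that is already sorted by the ORDER BY key, a ROWS frame, and the default behavior of treating nulls as ordinary values.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def nth_value_naive(values: Sequence[T], n: int) -> List[Optional[T]]:
    """Baseline: rebuild the frame [first row .. current row] for every row."""
    out: List[Optional[T]] = []
    for i in range(len(values)):
        frame = values[: i + 1]  # the frame grows by one row at each step
        out.append(frame[n - 1] if len(frame) >= n else None)
    return out


def nth_value_running(values: Sequence[T], n: int) -> List[Optional[T]]:
    """UNBOUNDED PRECEDING AND CURRENT ROW: look up the n-th row only once."""
    nth = values[n - 1] if len(values) >= n else None
    # Rows before the n-th one have a frame that is too short, so they get null.
    return [
        nth if row_number >= n else None
        for row_number in range(1, len(values) + 1)
    ]


def nth_value_whole_partition(values: Sequence[T], n: int) -> List[Optional[T]]:
    """UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: one value for all rows."""
    nth = values[n - 1] if len(values) >= n else None
    return [nth] * len(values)
```

The baseline copies a growing slice for each row, so its cost grows quadratically with the partition size, while the two specialized versions do a single lookup followed by one pass over the rows.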

    People

      Assignee: beliefer jiaan.geng
      Reporter: beliefer jiaan.geng