
    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: SQL
    • Labels:
    • Sprint: Spark 1.5 doc/QA sprint

      Description

      Here's a proposal for supporting window functions in the DataFrame DSL:

      1. Add an over function to Column:

      class Column {
        ...
        def over(window: Window): Column
        ...
      }
      

      2. Window:

      object Window {
        def partitionBy(...): Window
        def orderBy(...): Window
      
        object Frame {
          def unbounded: Frame
          def currentRow: Frame  // used by the first usage example
          def preceding(n: Long): Frame
          def following(n: Long): Frame
        }
      
        class Frame
      }
      
      class Window {
        def orderBy(...): Window
        def rowsBetween(Frame, Frame): Window
        def rangeBetween(Frame, Frame): Window  // maybe add this later
      }
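
To make the shape of the proposal concrete, here is a minimal stand-alone sketch that models it with plain Scala case classes. It is not Spark's implementation: the field names (`partitionCols`, `orderCols`, `rowFrame`), the `String*` parameters standing in for the elided `(...)` signatures, the `Frame` case objects, and the stripped-down `Column` are all assumptions made for illustration.

```scala
// Hypothetical model of the proposed DSL; none of this is Spark code.
sealed trait Frame
object Frame {
  case object Unbounded extends Frame
  case object CurrentRow extends Frame
  final case class Preceding(n: Long) extends Frame
  final case class Following(n: Long) extends Frame

  def unbounded: Frame = Unbounded
  def currentRow: Frame = CurrentRow
  def preceding(n: Long): Frame = Preceding(n)
  def following(n: Long): Frame = Following(n)
}

// Immutable window description: every builder call returns a new Window.
final case class Window(
    partitionCols: Seq[String] = Nil,
    orderCols: Seq[String] = Nil,
    rowFrame: Option[(Frame, Frame)] = None) {
  def orderBy(cols: String*): Window = copy(orderCols = cols)
  def rowsBetween(start: Frame, end: Frame): Window =
    copy(rowFrame = Some((start, end)))
}

// Entry points, mirroring `object Window` in the proposal.
object Window {
  def partitionBy(cols: String*): Window = Window(partitionCols = cols)
  def orderBy(cols: String*): Window = Window(orderCols = cols)
}

// Stand-in for Column: `over` only records which window the expression uses.
final case class Column(expr: String, window: Option[Window] = None) {
  def over(w: Window): Column = copy(window = Some(w))
}
```

With this model, `Column("avg(age)").over(Window.partitionBy("dept").orderBy("ts"))` yields a value that carries both the expression and its window description. That is the property the builder-style API relies on: nothing is evaluated until the column is used in a query.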
      

      Here's an example of how to use it:

      df.select(
        avg("age").over(Window.partitionBy("..", "..").orderBy("..", "..")
          .rowsBetween(Frame.unbounded, Frame.currentRow))
      )
      
      df.select(
        avg("age").over(Window.partitionBy("..", "..").orderBy("..", "..")
          .rowsBetween(Frame.preceding(50), Frame.following(10)))
      )
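
The two frames in these examples can be illustrated on an ordinary collection. The sketch below assumes the rows of a single partition are already sorted and computes the average over a row-based frame around each position. `movingAvg` and its `Option[Int]` bounds (where `None` means unbounded on that side) are hypothetical names chosen for this illustration, not part of the proposal.

```scala
// Hypothetical helper: average of `values` over a row-based frame.
// `preceding` / `following` are row counts relative to the current row;
// None means the frame is unbounded on that side.
def movingAvg(
    values: Seq[Double],
    preceding: Option[Int],
    following: Option[Int]): Seq[Double] =
  values.indices.map { i =>
    // Clamp the frame to the bounds of the partition.
    val start = preceding.fold(0)(p => math.max(0, i - p))
    val end = following.fold(values.length - 1)(f => math.min(values.length - 1, i + f))
    val frame = values.slice(start, end + 1)
    frame.sum / frame.length
  }

val ages = Seq(10.0, 20.0, 30.0, 40.0)

// Like rowsBetween(Frame.unbounded, Frame.currentRow): a running average.
val running = movingAvg(ages, None, Some(0))    // 10.0, 15.0, 20.0, 25.0

// Like rowsBetween(Frame.preceding(1), Frame.following(1)): a centered window.
val centered = movingAvg(ages, Some(1), Some(1)) // 15.0, 20.0, 30.0, 35.0
```

A frame that extends past the start or end of the partition is simply truncated, which is why the first and last rows of the centered window average only two values.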
      

              People

              • Assignee: Cheng Hao (chenghao)
              • Reporter: Reynold Xin (rxin)
              • Votes: 2
              • Watchers: 9
