Beam / BEAM-9198

BeamSQL aggregation analytics functionality

Details

    • New Feature
    • Status: Open
    • P3
    • Resolution: Unresolved
    • dsl-sql

    Description

      Mentor email: ruwang@google.com. Feel free to email any questions.

      Project Information
      ---------------------
      BeamSQL has a long list of aggregation and aggregation analytics functions still to support.

      To begin with, you will need to support this syntax:

      analytic_function_name ( [ argument_list ] )
        OVER (
          [ PARTITION BY partition_expression_list ]
          [ ORDER BY expression [{ ASC | DESC }] [, ...] ]
          [ window_frame_clause ]
        )
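
To make the three parts of the OVER clause concrete, here is a minimal plain-Python sketch of what a query such as `SUM(score) OVER (PARTITION BY team ORDER BY day ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)` is expected to compute. The function name `windowed_sum`, the column names, and the sample rows are all hypothetical and only illustrate the semantics; this is not BeamSQL code.

```python
from collections import defaultdict


def windowed_sum(rows, partition_key, order_key, value_key, preceding):
    """Sum value_key over a frame of `preceding` earlier rows plus the current row."""
    # PARTITION BY: split the input into independent groups.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    out = []
    for part in partitions.values():
        # ORDER BY ... ASC: fix the row order inside each partition.
        ordered = sorted(part, key=lambda r: r[order_key])
        for i, row in enumerate(ordered):
            # window_frame_clause: ROWS BETWEEN <preceding> PRECEDING AND CURRENT ROW.
            frame = ordered[max(0, i - preceding):i + 1]
            out.append({**row, "window_sum": sum(r[value_key] for r in frame)})
    return out


rows = [
    {"team": "a", "day": 1, "score": 10},
    {"team": "a", "day": 2, "score": 20},
    {"team": "a", "day": 3, "score": 30},
    {"team": "b", "day": 1, "score": 5},
]
result = windowed_sum(rows, "team", "day", "score", preceding=1)
```

Each input row produces exactly one output row, which is what distinguishes an analytic function from a plain GROUP BY aggregation.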
      

      As there is a long list of analytic functions, a good starting point is to support rank() first.
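
For reference, the standard SQL semantics of rank() can be sketched in a few lines of plain Python: rows that tie on the ORDER BY value share a rank, and the next distinct value skips ahead by the number of tied rows. The helper `rank_over` and the sample data are hypothetical; the sketch only describes the expected results, not how BeamSQL would implement them.

```python
from itertools import groupby


def rank_over(rows, partition_key, order_key, descending=False):
    """RANK() OVER (PARTITION BY partition_key ORDER BY order_key [DESC])."""
    out = []
    by_partition = sorted(rows, key=lambda r: r[partition_key])
    for _, part in groupby(by_partition, key=lambda r: r[partition_key]):
        ordered = sorted(part, key=lambda r: r[order_key], reverse=descending)
        rank = 0
        previous = object()  # sentinel that never equals a real value
        for position, row in enumerate(ordered, start=1):
            if row[order_key] != previous:
                # A new distinct value takes its 1-based position as rank,
                # which leaves a gap after a group of ties.
                rank = position
                previous = row[order_key]
            out.append({**row, "rank": rank})
    return out


rows = [
    {"team": "a", "player": "p1", "score": 90},
    {"team": "a", "player": "p2", "score": 90},
    {"team": "a", "player": "p3", "score": 80},
    {"team": "b", "player": "p4", "score": 70},
]
ranked = rank_over(rows, "team", "score", descending=True)
```

In this sample the two players tied at 90 both get rank 1 and the next player gets rank 3, not 2.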

      Doing so requires touching several core components of BeamSQL:
      1. The SQL parser, to support the syntax above.
      2. The SQL core, to implement the physical relational operator.
      3. Distributed algorithms, to implement the list of functions in a distributed manner.
      4. The ZetaSQL dialect, where the feature must be enabled.
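
One possible shape for item 3 is sketched below, under the assumption that rows are first shuffled so that each PARTITION BY key lands on a single worker, after which every worker can rank its rows without talking to the others. Workers are simulated with plain Python lists, and `shuffle_by_key` and `local_rank` are hypothetical helpers. In a real Beam pipeline the shuffle would more likely be a GroupByKey step; this sketch makes no claim about how BeamSQL actually implements it.

```python
def shuffle_by_key(rows, key, num_workers):
    """Route every row with the same key to the same simulated worker."""
    workers = [[] for _ in range(num_workers)]
    for row in rows:
        workers[hash(row[key]) % num_workers].append(row)
    return workers


def local_rank(rows, partition_key, order_key):
    """Rank = 1 + number of rows in the same partition with a smaller value."""
    out = []
    for row in rows:
        smaller = sum(
            1
            for other in rows
            if other[partition_key] == row[partition_key]
            and other[order_key] < row[order_key]
        )
        out.append({**row, "rank": smaller + 1})
    return out


rows = [
    {"team": "a", "score": 10},
    {"team": "a", "score": 20},
    {"team": "b", "score": 7},
    {"team": "c", "score": 3},
    {"team": "c", "score": 3},
    {"team": "c", "score": 9},
]

# Each worker ranks only the rows it received; the union is the final result.
distributed = [
    ranked_row
    for worker_rows in shuffle_by_key(rows, "team", num_workers=3)
    for ranked_row in local_rank(worker_rows, "team", "score")
]
single_node = local_rank(rows, "team", "score")
```

Because a partition never spans two workers, the distributed result matches the single-node one up to row order. A partition too large for one worker would need a different strategy, which this sketch does not cover.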

      To understand what SQL analytic functions are, see this explanation: https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts.

      To learn about Beam's programming model, see: https://beam.apache.org/documentation/programming-guide/#overview

            People

              Assignee: Unassigned
              Reporter: Rui Wang (amaliujia)
              Votes: 1
              Watchers: 5

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 10h