  1. Spark
  2. SPARK-27207

There exists a bug with SortBasedAggregator where merge()/update() operations get invoked on the aggregate buffer without calling initialize()

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Normally, the operations invoked on the aggregation buffer of a User Defined Aggregate Function (UDAF) follow the order initialize(), update(), eval() or initialize(), merge(), eval(). However, once the threshold configured by spark.sql.objectHashAggregate.sortBased.fallbackThreshold is reached, ObjectHashAggregate falls back to SortBasedAggregator, which invokes merge() or update() without first calling initialize() on the aggregate buffer.
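
      The call order described above can be illustrated with a minimal, Spark-free sketch. Everything here is an assumption made for illustration: SumBuffer, aggregate, and merge_partials are hypothetical names, not Spark classes, and the explicit check that raises on an uninitialized buffer is added only to make the ordering requirement visible. It does not reproduce SortBasedAggregator's internals.

```python
class SumBuffer:
    """Hypothetical aggregation buffer (not a Spark class)."""

    def __init__(self):
        # The buffer holds no usable state until initialize() runs.
        self.total = None

    def initialize(self):
        self.total = 0

    def update(self, value):
        # Fold one input row into the buffer.
        if self.total is None:
            raise RuntimeError("update() called before initialize()")
        self.total += value

    def merge(self, other):
        # Fold another partial buffer into this one.
        if self.total is None:
            raise RuntimeError("merge() called before initialize()")
        self.total += other.total

    def eval(self):
        return self.total


def aggregate(values):
    """initialize(), update()..., eval() order."""
    buf = SumBuffer()
    buf.initialize()
    for v in values:
        buf.update(v)
    return buf.eval()


def merge_partials(partitions):
    """initialize(), merge()..., eval() order over per-partition buffers."""
    partials = []
    for part in partitions:
        p = SumBuffer()
        p.initialize()
        for v in part:
            p.update(v)
        partials.append(p)

    final = SumBuffer()
    final.initialize()
    for p in partials:
        final.merge(p)
    return final.eval()
```

      In this sketch, calling update() or merge() on a fresh SumBuffer() without initialize() raises a RuntimeError, which is the kind of ordering violation the description reports.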

            People

              Assignee: Parth Gandhi (pgandhi)
              Reporter: Parth Gandhi (pgandhi)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: