SPARK-45929

Support grouping sets operation in the DataFrame API


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.1
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      I use the Spark DataFrame API for complex calculations. When I need grouping sets, my only option is to convert the expressions to SQL via the analyzed plan and then splice those fragments into one large SQL statement to execute. In some cases this produces an extremely complex SQL statement. When executing it, ANTLR4 keeps consuming a large amount of memory, in a way that resembles a memory leak. If grouping sets were available through the DataFrame API, as the rollup and cube functions already are, these calculations would be much simpler.
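      For readers unfamiliar with the operation being requested, here is a minimal plain-Python sketch of what grouping sets compute: one aggregation per grouping set, with the results unioned and the columns outside each set filled with None. It also shows how rollup and cube can be seen as particular lists of grouping sets. This only illustrates the semantics and is not Spark's implementation; the helper names (`grouping_sets_sum`, `rollup_sets`, `cube_sets`) and the sample rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def grouping_sets_sum(rows, columns, grouping_sets, value):
    """Sum `value` once per grouping set.

    Columns that are not part of the current grouping set are reported
    as None, mirroring the NULLs a SQL GROUPING SETS query would return.
    """
    result = []
    for gset in grouping_sets:
        totals = defaultdict(int)
        for row in rows:
            key = tuple(row[c] if c in gset else None for c in columns)
            totals[key] += row[value]
        result.extend(totals.items())
    return result


def rollup_sets(columns):
    """Grouping sets equivalent to ROLLUP: every prefix, longest first."""
    return [tuple(columns[:i]) for i in range(len(columns), -1, -1)]


def cube_sets(columns):
    """Grouping sets equivalent to CUBE: every subset of the columns."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


# Hypothetical sample data, for illustration only.
rows = [
    {"city": "A", "year": 2022, "sales": 10},
    {"city": "A", "year": 2023, "sales": 20},
    {"city": "B", "year": 2023, "sales": 5},
]

# Aggregate by city alone and by year alone in a single pass over the sets.
result = grouping_sets_sum(rows, ["city", "year"], [("city",), ("year",)], "sales")
for key, total in result:
    print(key, total)
```

      Arbitrary grouping sets such as `[("city",), ("year",)]` cannot be expressed by rollup or cube alone, which is why a dedicated DataFrame operation is useful.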

          People

            Assignee: JacobZheng
            Reporter: JacobZheng
            Votes: 0
            Watchers: 2
