Spark / SPARK-38529

Prevent GeneratorNestedColumnAliasing from being applied to non-Explode generators


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2.1
    • Fix Version/s: 3.3.0
    • Component/s: Optimizer
    • Labels: None

    Description

      The Project(_, g: Generate) branch in GeneratorNestedColumnAliasing is only supposed to work for ExplodeBase generators, but it does not explicitly return for other generator types such as Inline. Currently the bug is not triggered because another bug, in the "prune unrequired child" branch of ColumnPruning, makes other generators such as Inline always take that branch even when it is not applicable.

       

      An easy example to show the bug:

      Input: <col1: int, col2: array<struct<field1: struct<field1: int, field2: int>, field2: int>>>

      Project(field1.field1 as ...)
      +- Generate(Inline(col2), ..., field1, field2)

       

      The rule will incorrectly try to push the .field1 access on field1 down into the input of the Inline (col2).
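To make the example concrete, here is a minimal plain-Python model of the two generator shapes. It is not Spark code and does not reproduce the optimizer rule. The row values and the helper names `explode`, `inline`, and `rule_applies` are assumptions made for illustration only. The sketch shows that an Explode-style generator emits the whole array element as a single column, so extracting a field before or after the generator gives the same values, while an Inline-style generator emits one column per struct field, so `field1.field1` is an access on one of several generated columns.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def explode(array: List[Any], out_name: str) -> List[Row]:
    """Explode-style generator: a single output column holding each element."""
    return [{out_name: element} for element in array]


def inline(array_of_structs: List[Row]) -> List[Row]:
    """Inline-style generator: one output column per field of each struct."""
    return [dict(struct) for struct in array_of_structs]


# One input row matching the schema in the example (values are made up).
input_row: Row = {
    "col1": 1,
    "col2": [
        {"field1": {"field1": 10, "field2": 11}, "field2": 12},
        {"field1": {"field1": 20, "field2": 21}, "field2": 22},
    ],
}

# Reference evaluation of Project(field1.field1) over Generate(Inline(col2)).
generated = inline(input_row["col2"])
projected = [row["field1"]["field1"] for row in generated]  # [10, 20]

# Explode-style case: extracting a field after the generator ...
after = [row["item"]["field2"] for row in explode(input_row["col2"], "item")]
# ... equals generating over an array that was pruned to that field first.
pruned_array = [element["field2"] for element in input_row["col2"]]
before = [row["item"] for row in explode(pruned_array, "item")]

# Inline-style case: the generator output is several columns, not one element.
inline_columns = sorted(generated[0].keys())  # ['field1', 'field2']


def rule_applies(generator_name: str) -> bool:
    """Hypothetical guard: only Explode-style generators qualify."""
    return generator_name in {"Explode", "PosExplode"}
```

In this model, `rule_applies("Inline")` returns `False`, which mirrors the intent of the issue: the nested-column pushdown branch should return early for any generator that is not Explode-style.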

       

       


          People

            Assignee: Min Yang (miny)
            Reporter: Min Yang (miny)
            Votes: 0
            Watchers: 4


              Time Tracking

                Original Estimate: 24h
                Remaining Estimate: 24h
                Time Spent: Not Specified
