Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-10896

Support UNNEST an array of structs

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.27.0
    • Component/s: dsl-sql-zetasql
    • Labels:
      None

      Description

      Currently UNNEST an array of structs does not work properly in Beam ZetaSQL:

       

      e.g. SELECT p.some_field FROM table, UNNEST(table.array_of_structs) AS p

       

      Execution of such queries will crash with error in ProjectScanConverter:

      Exception in thread "main" java.lang.AssertionError: Field ordinal 1 is invalid for  type 'RecordType(VARCHAR id)'Exception in thread "main" java.lang.AssertionError: Field ordinal 1 is invalid for  type 'RecordType(VARCHAR id)' at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:197) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccessInternal(ExpressionConverter.java:881) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccess(ExpressionConverter.java:871) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr(ExpressionConverter.java:309) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccess(ExpressionConverter.java:869) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr(ExpressionConverter.java:309) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccess(ExpressionConverter.java:869) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr(ExpressionConverter.java:309) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromComputedColumnWithFieldList(ExpressionConverter.java:368) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.retrieveRexNode(ExpressionConverter.java:196) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert(ProjectScanConverter.java:45) at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert(ProjectScanConverter.java:29) at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode(QueryStatementConverter.java:99)

       

      The root cause is that Calcite 1.20.0 Uncollect will "unwrap" the struct/row in an array being unnested: https://github.com/apache/calcite/blob/calcite-1.20.0/core/src/main/java/org/apache/calcite/rel/core/Uncollect.java#L146-L152

       

      The Calcite Uncollect API has a change in 1.23.0 that could provide us a way to bypass this "unwrapping": https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/Uncollect.java#L171-L172

        Attachments

        Issue Links

          Activity

            People

            • Assignee:
              robinyqiu Robin Qiu
              Reporter:
              robinyqiu Robin Qiu

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1h 10m
                1h 10m

                  Issue deployment