  1. Spark
  2. SPARK-26812

PushProjectionThroughUnion nullability issue


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.0
    • Fix Version/s: 2.4.4, 3.0.0
    • Component/s: SQL

    Description

      The output data types of a Union are the output data types of its first child.
      However, the other children of the union may differ in the nullability of their values (for example, whether the values of a map may be null).
      This means that we cannot always push a projection down to the children.

      To reproduce

      Seq(Map("foo" -> "bar")).toDF("a").write.saveAsTable("table1")
      sql("SELECT 1 AS b").write.saveAsTable("table2")
      sql("CREATE OR REPLACE VIEW test1 AS SELECT map() AS a FROM table2 UNION ALL SELECT a FROM table1")
      sql("select * from test1").show
      

      This fails because the plan is no longer resolved.
      The plan is broken by the PushProjectionThroughUnion rule, which pushes down a cast to map<string,string> with value nullability = true onto a child of type map<string,string> with value nullability = false.
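
The type mismatch described above can be illustrated with a minimal sketch, assuming Spark's `org.apache.spark.sql.types` package is on the classpath. The object name `MapNullabilityCheck` and its field names are hypothetical and used only for illustration; this does not reproduce the optimizer rule itself, only the difference between the two map types it confuses.

```scala
import org.apache.spark.sql.types.{MapType, StringType}

// Hypothetical helper object, not part of Spark.
object MapNullabilityCheck {
  // Map type whose values may be null (the target type of the pushed-down cast).
  val nullableValues: MapType =
    MapType(StringType, StringType, valueContainsNull = true)

  // Map type whose values are declared non-null (the type of the affected child).
  val nonNullValues: MapType =
    MapType(StringType, StringType, valueContainsNull = false)

  def main(args: Array[String]): Unit = {
    // valueContainsNull is part of the type, so the two types are not equal.
    println(nullableValues == nonNullValues)
    // The short string form is the same for both, which hides the difference.
    println(nullableValues.simpleString)
    println(nonNullValues.simpleString)
  }
}
```

Both types print as map<string,string>, so the mismatch is easy to miss when reading a plan; comparing the `valueContainsNull` flag directly makes it visible.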

          People

            Assignee: Marco Gaido (mgaido)
            Reporter: Bogdan Raducanu (bograd)