Spark / SPARK-23802

PropagateEmptyRelation can leave query plan in unresolved state

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.1, 2.4.0
    • Component/s: SQL
    • Labels: None

    Description

      Since https://github.com/apache/spark/pull/19825, which taught PropagateEmptyRelation to handle more cases, the rule can leave the optimized query plan in an unresolved state.

      A simple repro is to run the following through the optimizer:

      LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
        .join(LocalRelation('a.int, 'b.int), UsingJoin(FullOuter, "a" :: Nil), None)

      This results in:

      Project [coalesce(a#0, null) AS a#7, null AS b#6]
      +- LocalRelation [a#0]

      This then fails the type check on the coalesce expression, since `a` (IntegerType) and null (NullType) have different types.
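      The failure can be illustrated without Spark. The following is a minimal toy model, not Catalyst code: `DataType`, `Expr`, `Attr`, `NullLiteral`, and `checkCoalesce` are hypothetical names, and the rule that all coalesce inputs must share exactly one type is an assumption made for the sketch.

```scala
// Toy model of the failing check; none of these names are Spark's own classes.
object CoalesceCheckSketch {
  sealed trait DataType
  case object IntegerType extends DataType
  case object NullType extends DataType

  sealed trait Expr { def dataType: DataType }
  case class Attr(name: String, dataType: DataType) extends Expr
  case object NullLiteral extends Expr { val dataType: DataType = NullType }

  // Assumed rule: every input to coalesce must have exactly the same type.
  def checkCoalesce(inputs: Seq[Expr]): Either[String, DataType] = {
    val types = inputs.map(_.dataType).distinct
    if (types.size == 1) Right(types.head)
    else Left(s"input types differ: ${types.mkString(", ")}")
  }

  // coalesce(a, null) as it appears in the plan above: IntegerType vs NullType.
  val result: Either[String, DataType] =
    checkCoalesce(Seq(Attr("a", IntegerType), NullLiteral))
}
```

      In this model `CoalesceCheckSketch.result` is a `Left`, mirroring the unresolved plan: the untyped null is the only thing that prevents the inputs from agreeing on a type.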

       

      A simple, targeted fix is to change PropagateEmptyRelation to add casts around the nulls. A more comprehensive fix would be to run type coercion at the end of optimization so that it can fix cases like this one. Alternatively, the type-checking code could treat NullType as equal to any other type and not fail the type check in the first place.
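      As a rough sketch of how the first and last options differ, the toy model below (hypothetical names throughout, not Spark's classes or the actual patch) compares wrapping nulls in a cast with relaxing the check so that NullType is ignored. The middle option, rerunning type coercion after optimization, is not modelled here.

```scala
// Toy model of two of the proposed fixes; hypothetical names, not Spark's classes.
object NullFixSketch {
  sealed trait DataType
  case object IntegerType extends DataType
  case object NullType extends DataType

  sealed trait Expr { def dataType: DataType }
  case class Attr(name: String, dataType: DataType) extends Expr
  case object NullLiteral extends Expr { val dataType: DataType = NullType }
  case class Cast(child: Expr, dataType: DataType) extends Expr

  // Strict check: all inputs must share exactly one type.
  def strictCheck(inputs: Seq[Expr]): Boolean =
    inputs.map(_.dataType).distinct.size == 1

  // Targeted fix: wrap each untyped null in a cast to the expected type.
  def castNulls(inputs: Seq[Expr], target: DataType): Seq[Expr] =
    inputs.map {
      case NullLiteral => Cast(NullLiteral, target)
      case other       => other
    }

  // Alternative: ignore NullType when comparing input types.
  def lenientCheck(inputs: Seq[Expr]): Boolean =
    inputs.map(_.dataType).filterNot(_ == NullType).distinct.size <= 1

  val original: Seq[Expr] = Seq(Attr("a", IntegerType), NullLiteral)
  val withCasts: Seq[Expr] = castNulls(original, IntegerType)
}
```

      In this model the cast approach changes only the expressions the rule emits and leaves the strict check untouched, whereas the lenient check changes behavior for every expression that is type-checked, which is presumably why the cast is described as the targeted fix.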

          People

            robert3005 Robert Kruszewski