Details
Bug
Status: Resolved
Critical
Resolution: Fixed
2.1.0
Description
We can demonstrate the problem with the following data set and query:
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("test").master("local").getOrCreate()

// t1 and t2 each contain a single row (a = 1).
val sql1 =
  """
    |create temporary view t1 as select * from values
    |(1)
    |as grouping(a)
  """.stripMargin

val sql2 =
  """
    |create temporary view t2 as select * from values
    |(1)
    |as grouping(a)
  """.stripMargin

// t3 and t4 each contain two rows (a = 1).
val sql3 =
  """
    |create temporary view t3 as select * from values
    |(1),
    |(1)
    |as grouping(a)
  """.stripMargin

val sql4 =
  """
    |create temporary view t4 as select * from values
    |(1),
    |(1)
    |as grouping(a)
  """.stripMargin

// ta and tb tag each source view's rows and combine them with union all.
val sqlA =
  """
    |create temporary view ta as
    |select a, 'a' as tag from t1 union all
    |select a, 'b' as tag from t2
  """.stripMargin

val sqlB =
  """
    |create temporary view tb as
    |select a, 'a' as tag from t3 union all
    |select a, 'b' as tag from t4
  """.stripMargin

// Inner join on both columns; only rows with matching tags should pair up.
val sql =
  """
    |select tb.* from ta inner join tb on
    |ta.a = tb.a and
    |ta.tag = tb.tag
  """.stripMargin

spark.sql(sql1)
spark.sql(sql2)
spark.sql(sql3)
spark.sql(sql4)
spark.sql(sqlA)
spark.sql(sqlB)
spark.sql(sql).show()
The query returns the following result, which is incorrect:
+---+---+
|  a|tag|
+---+---+
|  1|  b|
|  1|  b|
|  1|  a|
|  1|  a|
|  1|  b|
|  1|  b|
|  1|  a|
|  1|  a|
+---+---+
The correct results should be:
+---+---+
|  a|tag|
+---+---+
|  1|  a|
|  1|  a|
|  1|  b|
|  1|  b|
+---+---+
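The expected row count can be checked without Spark. The following is a minimal sketch in plain Scala collections that models ta and tb as they are defined in the reproduction above; the names ExpectedJoin, Row, expected and withoutTagPredicate are hypothetical and exist only for this illustration. It evaluates the join twice, once with both predicates and once with only ta.a = tb.a. The second variant yields the same rows as the incorrect output (four with tag a, four with tag b). That is consistent with the tag predicate not being applied, although this description does not state the root cause, so treat it as an observation rather than a diagnosis.

```scala
// Plain-Scala model of the views in the reproduction; no Spark required.
// All names here are hypothetical and used only for illustration.
object ExpectedJoin {
  final case class Row(a: Int, tag: String)

  // ta = t1 union all t2, where t1 and t2 each hold one row with a = 1
  val ta: Seq[Row] = Seq(Row(1, "a"), Row(1, "b"))

  // tb = t3 union all t4, where t3 and t4 each hold two rows with a = 1
  val tb: Seq[Row] = Seq(Row(1, "a"), Row(1, "a"), Row(1, "b"), Row(1, "b"))

  // Inner join on both predicates: ta.a = tb.a and ta.tag = tb.tag
  val expected: Seq[Row] =
    for (l <- ta; r <- tb if l.a == r.a && l.tag == r.tag) yield r

  // Same join with the tag predicate left out: ta.a = tb.a only
  val withoutTagPredicate: Seq[Row] =
    for (l <- ta; r <- tb if l.a == r.a) yield r

  def main(args: Array[String]): Unit = {
    println(s"both predicates: ${expected.size} rows")
    println(s"a only: ${withoutTagPredicate.size} rows")
  }
}
```

With both predicates, each of the two ta rows matches exactly the two tb rows that carry the same tag, which gives the four rows listed as the correct result.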