Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 4.0.0
Description
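Spark currently fails with an internal error on queries like the following, where a lateral subquery contains a UNION and one branch of the UNION references the outer table: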
```
CREATE TABLE IF NOT EXISTS t (t1 INT, t2 INT) USING json;
CREATE TABLE IF NOT EXISTS a (a1 INT) USING json;
SELECT 1
FROM t AS t_outer
LEFT JOIN LATERAL (
  SELECT b1, b2
  FROM (
    SELECT a.a1 AS b1,
           1 AS b2
    FROM a
    UNION
    SELECT t_outer.t1 AS b1,
           NULL AS b2
  ) AS t_inner
  WHERE (t_inner.b1 < t_outer.t2 OR t_inner.b1 IS NULL)
    AND t_inner.b1 = t_outer.t1
  ORDER BY t_inner.b1, t_inner.b2 DESC
  LIMIT 1
) AS lateral_table
```
The resulting stack trace is:
org.apache.spark.SparkException: <Redacted Exception Message>
    at org.apache.spark.SparkException$.internalError(SparkException.scala:97)
    at org.apache.spark.SparkException$.internalError(SparkException.scala:101)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.rewriteDomainJoins(DecorrelateInnerQuery.scala:447)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.$anonfun$rewriteDomainJoins$7(DecorrelateInnerQuery.scala:463)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1308)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1307)
    at org.apache.spark.sql.catalyst.plans.logical.Project.mapChildren(basicLogicalOperators.scala:87)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.rewriteDomainJoins(DecorrelateInnerQuery.scala:463)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.$anonfun$rewriteDomainJoins$5(DecorrelateInnerQuery.scala:453)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at scala.collection.TraversableLike.map(TraversableLike.scala:286)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
    at scala.collection.AbstractTraversable.map(Traversable.scala:108)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:744)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.rewriteDomainJoins(DecorrelateInnerQuery.scala:451)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.$anonfun$rewriteDomainJoins$7(DecorrelateInnerQuery.scala:463)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1308)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1307)
    at org.apache.spark.sql.catalyst.plans.logical.Aggregate.mapChildren(basicLogicalOperators.scala:1470)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.rewriteDomainJoins(DecorrelateInnerQuery.scala:463)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.$anonfun$rewriteDomainJoins$7(DecorrelateInnerQuery.scala:463)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1308)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1307)
    at org.apache.spark.sql.catalyst.plans.logical.Filter.mapChildren(basicLogicalOperators.scala:344)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.rewriteDomainJoins(DecorrelateInnerQuery.scala:463)
    at org.apache.spark.sql.catalyst.optimizer.DecorrelateInnerQuery$.$anonfun$rewriteDomainJoins$7(DecorrelateInnerQuery.scala:463)
...
See this investigation doc for more context:
https://docs.google.com/document/d/1HtBDPKVD6pgGntTXdPVX27xH7PdcKTYNyQJLnwr7T-U/edit?usp=sharing