[SPARK-18390] Optimized plan tried to use Cartesian join when it is not enabled - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 2.0.1
Fix Version/s: None
Component/s: SQL
Labels: None

Description

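import org.apache.spark.sql.functions.lit

// df2("one") is the constant 1 on every row, so the join condition compares df3("id") to a literal.
val df2 = spark.range(1e9.toInt).withColumn("one", lit(1))
val df3 = spark.range(1e9.toInt)
df3.join(df2, df3("id") === df2("one")).count()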

throws

org.apache.spark.sql.AnalysisException: Cartesian joins could be prohibitively expensive and are disabled by default. To explicitly enable them, please set spark.sql.crossJoin.enabled = true;

This is probably not the right behavior: the user never asked for a Cartesian product. Spark SQL's optimizer chose one even though it knew Cartesian joins were not enabled.
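As a stopgap, the error message itself points at the `spark.sql.crossJoin.enabled` setting. The following is a minimal sketch of that opt-in, not a fix for the planner behavior reported here. The object name, the `local[*]` master, and the small row count `n` are assumptions made for illustration; the original report used 1e9 rows.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Hypothetical driver object, used only for this illustration.
object CrossJoinOptInSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-18390-sketch")
      .master("local[*]") // assumption: local run for experimentation
      .getOrCreate()

    // Opt in to Cartesian products for this session, as the error message suggests.
    spark.conf.set("spark.sql.crossJoin.enabled", "true")

    // Assumption: a small n so that a Cartesian plan stays cheap.
    val n = 1000L
    val df2 = spark.range(n).withColumn("one", lit(1))
    val df3 = spark.range(n)

    // Same join shape as in the report: df2("one") is always 1, so only the
    // df3 row with id = 1 can match, and it matches every row of df2.
    val matches = df3.join(df2, df3("id") === df2("one")).count()
    println(s"matching rows: $matches")

    spark.stop()
  }
}
```

Enabling the flag only silences the check; whether the resulting plan is acceptable still depends on the data sizes involved, so it seems safer to set it per session rather than globally.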

Issue Links

relates to: SPARK-17298 Require explicit CROSS join for cartesian products by default (Resolved)

People

Assignee: Srinath

Reporter: Xiangrui Meng

Votes: 0

Watchers: 5

Dates

Created: 09/Nov/16 19:43

Updated: 09/Nov/16 23:26

Resolved: 09/Nov/16 23:25