SPARK-13219: Pushdown predicate propagation in SparkSQL with join

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.4.1, 1.5.2, 1.6.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None
    • Environment:

      Spark 1.4
      Datastax Spark connector 1.4
      Cassandra 2.1.12
      Centos 6.6

      Description

      When 2 or more tables are joined in SparkSQL and the query has an equality clause on the attributes used to perform the join, it is useful to apply that clause to the scans of both tables. If this is not done, one of the tables ends up being fully scanned, which can slow the query dramatically. Consider the following example with 2 tables being joined.

      CREATE TABLE assets (
          assetid int PRIMARY KEY,
          address text,
          propertyname text
      )
      CREATE TABLE tenants (
          assetid int PRIMARY KEY,
          name text
      )
      spark-sql> explain select t.name from tenants t, assets a where a.assetid = t.assetid and t.assetid='1201';
      WARN  2016-02-05 23:05:19 org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      == Physical Plan ==
      Project [name#14]
       ShuffledHashJoin [assetid#13], [assetid#15], BuildRight
        Exchange (HashPartitioning 200)
         Filter (CAST(assetid#13, DoubleType) = 1201.0)
          HiveTableScan [assetid#13,name#14], (MetastoreRelation element, tenants, Some(t)), None
        Exchange (HashPartitioning 200)
         HiveTableScan [assetid#15], (MetastoreRelation element, assets, Some(a)), None
      Time taken: 1.354 seconds, Fetched 8 row(s)
      

      The simple workaround is to add the equality condition explicitly for each table (for example, also adding a.assetid = '1201' to the query above, as sketched below), but that quickly becomes cumbersome. It would be helpful if the query planner could propagate such filters automatically.
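
      For illustration, a minimal sketch of that manual workaround against the tables above (using the Spark 2.x SparkSession API purely for brevity; on 1.x the equivalent call would go through HiveContext.sql):

      // Minimal sketch of the manual workaround: repeat the equality predicate
      // for every table that takes part in the join.
      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

      spark.sql(
        """SELECT t.name
          |FROM tenants t, assets a
          |WHERE a.assetid = t.assetid
          |  AND t.assetid = 1201
          |  AND a.assetid = 1201
          |""".stripMargin).explain()
      // The last predicate (a.assetid = 1201) is the hand-added duplicate that
      // pushes the filter onto the assets scan as well.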

        Issue Links

          Activity

          smilegator Xiao Li added a comment -

          See this PR: https://github.com/apache/spark/pull/10490.

          Let me know if you hit any bug. Thanks!

          doodlegum Abhinav Chawade added a comment -

          Thanks Xiao. I will pull in the request and see how it performs.

          smilegator Xiao Li added a comment -

          Welcome

          doodlegum Abhinav Chawade added a comment -

          I created a build of Spark 1.4.1 that incorporates your patch, but somehow the predicates are still not being propagated. The steps I followed:
          1) Build Spark 1.4.1 with the patch incorporated.
          2) Replace the spark-catalyst jar on all nodes.
          3) Run explain on the following command in spark-sql. Notice the query plan.

          spark-sql> explain select t.assetid from tenants t inner join assets on t.assetid = assets.assetid where t.assetid=1201;    
          == Physical Plan ==
          Project [assetid#18]
           ShuffledHashJoin [assetid#18], [assetid#20], BuildRight
            Exchange (HashPartitioning 200)
             Filter (assetid#18 = 1201)
              HiveTableScan [assetid#18], (MetastoreRelation element22082, tenants, Some(t)), None
            Exchange (HashPartitioning 200)
             HiveTableScan [assetid#20], (MetastoreRelation element22082, assets, None), None
          Time taken: 2.741 seconds, Fetched 8 row(s)
          
          smilegator Xiao Li added a comment -

          Let me try your SQL query in Spark 1.6.1.

          doodlegum Abhinav Chawade added a comment -

          Here is my branch on github if you'd like to take a look. https://github.com/drnushooz/spark/tree/v1.4.1-SPARK-13219

          velvia Evan Chan added a comment -

          Xiao Li, does your PR take care of the case where no JOIN clause is used?
          Does it also take care of multiple join conditions? (e.g., select from a a, b b, c c where a.col1 = b.col1 && b.col1 = c.col1 && ....)

          smilegator Xiao Li added a comment - - edited

          It should work without JOIN clauses, as long as these predicates are pushed into the join/filter conditions.

          However, the current PR is unable to infer conditions that require multiple hops. I am glad to work on that, if needed.

          Let me ping Reynold Xin and Michael Armbrust.

          smilegator Xiao Li added a comment -

          Sorry, there is a bug in the original PR. You just need to change the code based on the fix:

          https://github.com/gatorsmile/spark/commit/20d46c9bee2d99966406e6450b159ca404578aa6

          Let me know if it works now. Thanks!

          doodlegum Abhinav Chawade added a comment -

          Xiao Li I changed the patch based on https://github.com/gatorsmile/spark/commit/20d46c9bee2d99966406e6450b159ca404578aa6 and predicate pushdown now works for 2 or more tables. I created another table called parkings (assetid int primary key, parkingid int) and joined it with assets and tenants to get the result. Looking at the explain output, the filter gets pushed down to every scan, which can speed things up.

          spark-sql> explain select p.parkingid from parkings p,assets a,tenants t where t.assetid=a.assetid and a.assetid=p.assetid and t.assetid=1201;
          == Physical Plan ==
          Project [parkingid#22]
           ShuffledHashJoin [assetid#23], [assetid#26], BuildRight
            Exchange (HashPartitioning 200)
             Project [parkingid#22,assetid#23]
              ShuffledHashJoin [assetid#21], [assetid#23], BuildRight
               Exchange (HashPartitioning 200)
                Filter (assetid#21 = 1201)
                 HiveTableScan [assetid#21,parkingid#22], (MetastoreRelation element22082, parkings, Some(p)), None
               Exchange (HashPartitioning 200)
                Filter (assetid#23 = 1201)
                 HiveTableScan [assetid#23], (MetastoreRelation element22082, assets, Some(a)), None
            Exchange (HashPartitioning 200)
             Filter (assetid#26 = 1201)
              HiveTableScan [assetid#26], (MetastoreRelation element22082, tenants, Some(t)), None
          Time taken: 0.43 seconds, Fetched 15 row(s)
          

          I will do some more tests with inner, left outer and right outer joins, but with a simple select across 3 tables the result looks promising. I have updated the code in my branch based on your fix: https://github.com/drnushooz/spark/tree/v1.4.1-SPARK-13219

          smilegator Xiao Li added a comment -

          Great!

          venugopalrp@gmail.com Venu Palvai added a comment -

          We have written custom code to handle this through a custom thrift server implementation. We are able to handle multiple tables and multiple columns in the join. We would like to collaborate and share details of our code.

          velvia Evan Chan added a comment -

          Xiao Li Abhinav Chawade what is the URL to the latest patch? Would like to contribute our code to this.

          doodlegum Abhinav Chawade added a comment -

          Evan Chan You can find the patch at https://github.com/apache/spark/pull/10490/files plus https://github.com/gatorsmile/spark/commit/20d46c9bee2d99966406e6450b159ca404578aa6, which fixes a bug, and the 1.4 branch on my fork.
          Hide
          smilegator Xiao Li added a comment -

          Thank you! Could you hold off for now? I think we can do this in a better way after resolving https://issues.apache.org/jira/browse/SPARK-12957

          velvia Evan Chan added a comment -

          Sorry, could you explain how SPARK-12957 affects this one?

          velvia Evan Chan added a comment -

          Xiao Li Abhinav Chawade

          Guys, let me explain the strategy that we used to fix the join transitivity, which I believe is much more general and helps many more cases than the approach in PR 10490.

          • First, we find all of the join columns and, for each join column, all the tables that are joined on it.
          • Next, we discover all the predicates (currently equals and IN; it could be more) that filter on those join columns.
          • For each join column, we compute which of the joined tables are missing those predicates.
          • We replicate the filter expression (using AND) onto each missing table from the previous step.

          The result is that no matter the number of tables and join columns, the predicates are augmented so that = and IN on literals are pushed to all the joined tables; a simplified sketch of the idea follows below.

          The only catch is that the current code works on unanalyzed logical plans, so we need to port it to work on analyzed logical plans instead.
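
          For illustration only, here is a minimal, self-contained Scala sketch of that strategy. It does not touch Catalyst; the Predicate type and the join-column map are hypothetical stand-ins meant purely to show the replication step described above.

          // Hypothetical model of the predicate-replication strategy; not Catalyst code.
          object PredicateReplicationSketch {

            // A literal predicate such as `tenants.assetid = 1201` or `... IN (1, 2)`.
            case class Predicate(table: String, column: String, expr: String)

            def replicate(
                // join column -> all tables joined on that column
                joinColumns: Map[String, Set[String]],
                // literal predicates (= / IN) found in the WHERE clause
                predicates: Seq[Predicate]): Seq[Predicate] = {
              joinColumns.toSeq.flatMap { case (column, tables) =>
                // predicates that already filter on this join column
                val existing = predicates.filter(_.column == column)
                // joined tables that are missing those predicates
                val missing = tables -- existing.map(_.table).toSet
                // replicate each filter expression onto every missing table (ANDed in)
                for (p <- existing; t <- missing) yield p.copy(table = t)
              }
            }

            def main(args: Array[String]): Unit = {
              val joins = Map("assetid" -> Set("tenants", "assets", "parkings"))
              val preds = Seq(Predicate("tenants", "assetid", "= 1201"))
              // prints the predicates to add: assets.assetid = 1201 and parkings.assetid = 1201
              replicate(joins, preds).foreach(p => println(s"${p.table}.${p.column} ${p.expr}"))
            }
          }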

          venugopalrp@gmail.com Venu Palvai added a comment -

          Hi Evan,
          I'm assuming that if the solution is ported to analyzed logical plans, it will work for any queries generated via the spark shell / language-integrated queries.

          thanks,

          smilegator Xiao Li added a comment -

          Hi Evan Chan, after a discussion with Michael, he prefers enhancing the existing Constraints framework to resolve this issue. I will reimplement the whole thing based on the new framework. Thanks!

          ndimiduk Nick Dimiduk added a comment -

          Now that all the subtasks on SPARK-12957 are resolved, where does that leave this feature? I'm trying to determine if I get this very useful enhancement by upgrading to 2.x. Thanks a lot!

          smilegator Xiao Li added a comment -

          Actually, this PR has already been resolved. We can close it now. Thanks!

          ndimiduk Nick Dimiduk added a comment -

          The result is that no matter the number of tables and join columns, the predicates are augmented such that = and IN on literals are pushed to all the joined tables.

          Evan Chan, Xiao Li, is the above statement an accurate reflection of this patch? I've upgraded to 2.1.0 in order to get this optimization. What about the case of a broadcast join – will the values present in the smaller relation be pushed down to the scan of the larger one? I'm not seeing that behavior. Maybe this optimization is out of scope for this patch? I do see IsNotNull pushed down, applied to the columns involved in the join constraint, but I was expecting a bit more.

          Thanks again for the patch!

          smilegator Xiao Li added a comment -

          If the predicates can be inferred, we can definitely push them down. However, not all predicates can be pushed down; it depends on the join types. If you hit any scenario where you believe we can optimize further, please open a JIRA. Thanks!

          ndimiduk Nick Dimiduk added a comment -

          I would implement this manually by materializing the smaller relation in the driver and then transforming those values into a filter applied to the larger one (roughly as sketched below). Frankly, I expected this to be the meaning of a broadcast join. I'm wondering if I'm doing something to prevent the planner from performing this optimization, so maybe the mailing list is a more appropriate place to discuss?
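
          For illustration, a rough sketch of that manual approach (this is not what the planner does; the table and column names reuse the earlier example and are otherwise assumptions):

          import org.apache.spark.sql.SparkSession
          import org.apache.spark.sql.functions.col

          val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

          val small = spark.table("tenants").select("assetid").distinct()
          val large = spark.table("assets")

          // Materialize the join keys of the small relation on the driver...
          val keys = small.collect().map(_.get(0))

          // ...then apply them as an IN filter on the large relation and join as usual.
          val pruned = large.filter(col("assetid").isin(keys: _*))
          pruned.join(small, "assetid").explain()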

          venugopalrp@gmail.com Venu Palvai added a comment -

          Nick,
          We addressed this issue by creating a custom hive context to modify the query plans generated by Spark SQL. We have implemented it in such a way that the values for = and IN operators get pushed against the joining tables. Looking forward, it should also work for pushing range scans down to the data source.

          In addition to the optimisations mentioned above, we have made several other improvements to augment query plans.

          We can share our code and thoughts if the Spark community is interested in learning about our approach.

          We have implemented a data warehouse/analytics platform using a dimensional data model approach on Spark+Cassandra by using our customizations.
          thanks,
          Venu Palvai


          tanejagagan gagan taneja added a comment -

          Venu,
          This is very interesting; I would like to look at the code for all the optimizations that are in place.
          Do you have plans to contribute it back to Spark?

          tanejagagan gagan taneja added a comment -

          This is what we are looking for. For example:
          Table Address is partitioned on postal_code.
          Table Location contains location_name and postal_code.
          Query: SELECT * FROM address JOIN location ON postal_code WHERE location_name = 'San Jose'
          If the query is rewritten with an IN clause, the optimizer will be able to prune the partitions, which would be significantly faster.

          velvia Evan Chan added a comment -

          Hi Gagan,

          That is an interesting optimization, but not the same one that Venu speaks of (I worked on those optimizations). Basically, those optimizations are for the case where the column named in the WHERE clause is present in both tables, and my impression is that this is what this fix is for as well.

          Your case would be very useful too. You can do it in two steps, though: first look up the postal codes from location, then translate your select from address into an IN condition (roughly as sketched below).

          Of course it’s better if Spark does this so that the results don’t have to be passed back through the driver.
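
          For illustration, a rough sketch of that two-step rewrite, using the table and column names from the previous comment (it assumes the postal-code lookup returns a non-empty result):

          import org.apache.spark.sql.SparkSession

          val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

          // Step 1: look up the postal codes for the location of interest.
          val codes = spark
            .sql("SELECT postal_code FROM location WHERE location_name = 'San Jose'")
            .collect()
            .map(_.get(0).toString)

          // Step 2: query the partitioned table with an IN list so the optimizer
          // can prune partitions on postal_code.
          val inList = codes.map(c => s"'$c'").mkString(", ")
          spark.sql(s"SELECT * FROM address WHERE postal_code IN ($inList)").explain()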

          azeroth2b Shawn Lavelle added a comment -

          Is https://issues.apache.org/jira/browse/SPARK-19609 working on this issue now?

            People

            • Assignee: Unassigned
            • Reporter: doodlegum Abhinav Chawade
            • Votes: 7
            • Watchers: 18
