[SPARK-19408] cardinality estimation involving two columns of the same table - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.1.0
Fix Version/s: 2.2.0
Component/s: Optimizer, SQL
Labels: None
Target Version/s: 2.2.0

Description

In SPARK-17075, we estimate the cardinality of predicate expressions of the form "column (op) literal", where op is =, <, <=, >, >= or <=>. SQL queries also contain predicate expressions involving two columns, such as "column-1 (op) column-2", where column-1 and column-2 belong to the same table. Note that if column-1 and column-2 belong to different tables, the predicate is handled by a join operator, NOT by a filter operator.

In this JIRA, we want to estimate the filter factor of predicate expressions involving two columns of the same table. For example, multiple TPC-H queries have this kind of predicate: "WHERE l_commitdate < l_receiptdate".
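
To make the idea concrete, here is a minimal Python sketch of one way a filter factor for "column-1 (op) column-2" could be derived from per-column min/max statistics. It is an illustration only, not the implementation in Spark: the names `ColumnRange`, `estimate_filter_factor` and `DEFAULT_OVERLAP_SELECTIVITY` are hypothetical, null values are ignored, `<=>` is not handled, and the value returned for overlapping ranges is an arbitrary placeholder.

```python
from dataclasses import dataclass

# Placeholder guess used when the two value ranges overlap.
# This constant is an assumption for illustration, not a tuned value.
DEFAULT_OVERLAP_SELECTIVITY = 1.0 / 3.0


@dataclass(frozen=True)
class ColumnRange:
    """Hypothetical per-column statistics: smallest and largest non-null value."""
    min: float
    max: float


def estimate_filter_factor(op: str, left: ColumnRange, right: ColumnRange) -> float:
    """Estimate the fraction of rows satisfying `left op right`.

    Only the min/max ranges are consulted. If the ranges prove that every
    row (or no row) satisfies the predicate, the answer is exact; otherwise
    the placeholder default is returned.
    """
    if op == "<":
        all_rows, no_rows = left.max < right.min, left.min >= right.max
    elif op == "<=":
        all_rows, no_rows = left.max <= right.min, left.min > right.max
    elif op == ">":
        all_rows, no_rows = left.min > right.max, left.max <= right.min
    elif op == ">=":
        all_rows, no_rows = left.min >= right.max, left.max < right.min
    elif op == "=":
        # Every row matches only if both columns hold the same single value.
        all_rows = left.min == left.max == right.min == right.max
        # No row matches if the ranges are disjoint.
        no_rows = left.max < right.min or left.min > right.max
    else:
        raise ValueError(f"unsupported operator: {op}")

    if all_rows:
        return 1.0
    if no_rows:
        return 0.0
    return DEFAULT_OVERLAP_SELECTIVITY


# Dates encoded as day numbers, purely for illustration.
commit_date = ColumnRange(min=100, max=200)
receipt_date = ColumnRange(min=150, max=260)
factor = estimate_filter_factor("<", commit_date, receipt_date)
```

The estimated output cardinality of the filter would then be the filter factor multiplied by the input row count. A more refined estimator could use histograms or distinct-value counts for the overlapping case instead of a fixed default.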