[SPARK-33544] explode should not filter when used with CreateArray - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.0
Fix Version/s: 3.1.0
Component/s: SQL
Labels: None

Description

https://issues.apache.org/jira/browse/SPARK-32295 added an optimization that inserts a filter for not null and size > 0 when using an inner explode/inline. This is fine in most cases, but the extra filter is not needed when the explode input is a CreateArray whose elements are not Literals (the Literal case is already handled). In that situation the array is known to be non-null and to have a size greater than zero. The empty-array case is already handled as well.

For instance:

val df = someDF.selectExpr("number", "explode(array(word, col3))")

So in this case we shouldn't insert the extra Filter. That filter can also get pushed down into, for example, a Parquet reader, so it only adds overhead.
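
The reasoning above can be sketched as a small decision function. This is a toy model only: the `Expr`, `Column`, and `CreateArray` types and the `needsInferredFilter` helper below are hypothetical stand-ins written for illustration, not Spark's Catalyst classes or the actual fix. It assumes that an array built from at least one column expression is never null and never empty, and it conservatively keeps the filter for everything else. The Literal and empty-array cases, which the description says are already handled, are deliberately not modelled.

```scala
// Toy model only: hypothetical types, not Spark's Catalyst expression classes.
sealed trait Expr
final case class Column(name: String) extends Expr
final case class CreateArray(children: Seq[Expr]) extends Expr

object ExplodeFilterSketch {
  // Returns true when an inferred "IS NOT NULL AND size > 0" filter on the
  // generator input could still remove rows, false when it is redundant.
  def needsInferredFilter(generatorInput: Expr): Boolean = generatorInput match {
    // array(word, col3): the array value itself is non-null and has a
    // size equal to the number of children, so the filter removes nothing.
    case CreateArray(children) if children.nonEmpty => false
    // Anything else (for example a plain array-typed column) may be null
    // or empty, so the sketch keeps the filter.
    case _ => true
  }
}
```

With these definitions, `ExplodeFilterSketch.needsInferredFilter(CreateArray(Seq(Column("word"), Column("col3"))))` evaluates to `false`, while a bare `Column("words")` input evaluates to `true`.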

Issue Links

contains

SPARK-24913 Make `AssertTrue` and `AssertNotNull` non-deterministic

Resolved

relates to

SPARK-32295 Add not null and size > 0 filters before inner explode to benefit from predicate pushdown

Resolved

links to

[Github] Pull Request #30504 (tgravescs)

[Github] Pull Request #30570 (HyukjinKwon)

People

Assignee: Thomas Graves

Reporter: Thomas Graves

Votes: 0

Watchers: 4

Dates

Created: 24/Nov/20 20:55

Updated: 12/Dec/22 18:10

Resolved: 02/Dec/20 00:54