[SPARK-37369] Avoid redundant ColumnarToRow transition on InMemoryTableScan - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.3.0
Fix Version/s: 3.3.0
Component/s: SQL
Labels: None

Description

We have a rule that inserts columnar transitions between row-based and columnar query plans. InMemoryTableScanExec can produce columnar output, so if its parent plan isn't columnar, the rule adds a ColumnarToRow between them.

However, InMemoryTableScanExec is a special query plan because it can convert a cached batch to either a columnar batch or rows.

In this case, InMemoryTableScanExec first converts the cached batch to a columnar batch, and the added ColumnarToRow then converts that batch to rows before the parent plan consumes them.

Instead, we can simply ask InMemoryTableScanExec to produce row output directly, which avoids the redundant conversion. The plan below shows such a case:

```
+- Union
   :- ColumnarToRow
   :  +- InMemoryTableScan i#8, j#9
   :     +- InMemoryRelation i#8, j#9, StorageLevel(disk, memory, deserialized, 1 replicas)
```
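
The behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual implementation: `Plan`, `insert_transitions`, `render`, and the `skip_redundant` flag are hypothetical names invented for illustration, and the scan is modeled as a leaf node for simplicity. It shows a transition rule that wraps a columnar child in ColumnarToRow when the parent is row-based, and a variant that instead asks InMemoryTableScan to emit rows itself.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical stand-in for a physical plan node (not a Spark class)."""
    name: str
    children: Tuple["Plan", ...] = ()
    supports_columnar: bool = False  # True if the node emits columnar batches


def insert_transitions(plan: Plan, skip_redundant: bool) -> Plan:
    """Toy version of a rule that inserts ColumnarToRow where needed.

    With skip_redundant=False, every columnar child under a row-based parent
    is wrapped in ColumnarToRow. With skip_redundant=True, an
    InMemoryTableScan child is switched to row output instead, because it can
    turn cached batches into rows on its own.
    """
    new_children = []
    for child in plan.children:
        child = insert_transitions(child, skip_redundant)
        if child.supports_columnar and not plan.supports_columnar:
            if skip_redundant and child.name == "InMemoryTableScan":
                # Ask the scan to produce rows directly; no extra node needed.
                child = replace(child, supports_columnar=False)
            else:
                # Generic case: convert columnar batches to rows in a new node.
                child = Plan("ColumnarToRow", (child,))
        new_children.append(child)
    return replace(plan, children=tuple(new_children))


def render(plan: Plan, depth: int = 0) -> List[str]:
    """Return the plan as indented lines, two spaces per tree level."""
    lines = ["  " * depth + plan.name]
    for child in plan.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    scan = Plan("InMemoryTableScan", supports_columnar=True)
    union = Plan("Union", (scan,))
    print("\n".join(render(insert_transitions(union, skip_redundant=False))))
    print("\n".join(render(insert_transitions(union, skip_redundant=True))))
```

In this toy model, the first call reproduces the shape of the plan above (a ColumnarToRow between Union and the scan), while the second leaves the scan directly under Union.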

Issue Links

links to

[Github] Pull Request #34642 (viirya)

[Github] Pull Request #35061 (linhongliu-db)

People

Assignee: L. C. Hsieh

Reporter: L. C. Hsieh

Votes: 0

Watchers: 3

Dates

Created: 18/Nov/21 04:48

Updated: 05/Jun/23 16:33

Resolved: 13/Dec/21 01:50