[SPARK-26837] Pruning nested fields from object serializers - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0
Fix Version/s: 3.0.0
Component/s: SQL
Labels: None

Description

In SPARK-26619, we made a change to prune unnecessary individual serializers when serializing objects. This issue is an extension of SPARK-26619: we can further prune nested fields from object serializers if they are not used.

For example, in the following query, we only use one field of a struct column:

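import spark.implicits._  // needed for toDS(); assumes a SparkSession named `spark`, as in spark-shell
val data = Seq((("a", 1), 1), (("b", 2), 2), (("c", 3), 3))
val df = data.toDS().map(t => (t._1, t._2 + 1)).select("_1._1")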

So, instead of having a serializer that creates a two-field struct, we can prune the unused field from it and serialize only the field that is read.
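The idea can be illustrated with a small toy model in plain Scala. This is only a sketch: the names FieldSerializer, StructSerializer, and prune are hypothetical and are not Spark's Catalyst classes or the code in the linked pull request. It models a struct serializer as a list of named field extractors and drops the ones that no downstream operator reads.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's Catalyst classes.
object SerializerPruningSketch {
  // One field of a struct: its name plus a function that extracts the value from the object.
  final case class FieldSerializer[T](name: String, extract: T => Any)

  // A struct serializer turns an object into a struct value (modeled here as a Map).
  final case class StructSerializer[T](fields: Seq[FieldSerializer[T]]) {
    def serialize(obj: T): Map[String, Any] =
      fields.map(f => f.name -> f.extract(obj)).toMap

    // Keep only the fields that a downstream operator actually reads.
    def prune(required: Set[String]): StructSerializer[T] =
      copy(fields = fields.filter(f => required.contains(f.name)))
  }

  // Serializer for the inner tuple such as ("a", 1): two fields, _1 and _2.
  val full: StructSerializer[(String, Int)] = StructSerializer(Seq(
    FieldSerializer[(String, Int)]("_1", _._1),
    FieldSerializer[(String, Int)]("_2", _._2)
  ))

  // The query selects only "_1._1", so field _2 of the inner struct is never read.
  val pruned: StructSerializer[(String, Int)] = full.prune(Set("_1"))

  def main(args: Array[String]): Unit = {
    println(full.serialize(("a", 1)))   // struct with both fields
    println(pruned.serialize(("a", 1))) // struct with only the field that is used
  }
}
```

In this model the pruned serializer never evaluates the extractor for _2, which corresponds to the saving described above: work spent building struct fields that are discarded immediately afterwards is avoided.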

Issue Links

links to

GitHub Pull Request #23740

People

Assignee: L. C. Hsieh

Reporter: L. C. Hsieh

Votes: 0

Watchers: 1

Dates

Created: 06/Feb/19 15:45

Updated: 27/Feb/19 04:51

Resolved: 27/Feb/19 04:51