Description
Following up on SPARK-34638, I may have found another case where nested field pruning is not applied. In this case, a query that uses the `count` function reads the full nested schema.
Example:
1) Loading data
val jsonStr = """{ "items": [ {"itemId": 1, "itemData": "a"}, {"itemId": 2, "itemData": "b"} ] }"""
val df = spark.read.json(Seq(jsonStr).toDS)
df.write.format("parquet").mode("overwrite").saveAsTable("persisted")
2) Read query with explain
val read = spark.table("persisted")
spark.conf.set("spark.sql.optimizer.nestedSchemaPruning.enabled", true)
read.select(explode($"items").as('item)).select(count(lit(true))).explain(true)
// ReadSchema: struct<items:array<struct<itemData:string,itemId:bigint>>>
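The ReadSchema can also be checked programmatically rather than by reading the explain output. The following is only a sketch, assuming it runs in the same spark-shell session after step 1 and that the table is read through a `FileSourceScanExec` node. `scanSchemas` is a hypothetical helper name introduced here for illustration, not part of Spark.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.execution.FileSourceScanExec
import org.apache.spark.sql.functions.{count, explode, lit}
import org.apache.spark.sql.types.StructType
import spark.implicits._

// Hypothetical helper: returns the schema each file scan requests from the reader.
// sparkPlan is used instead of executedPlan so that adaptive query execution
// does not hide the scan nodes behind a wrapper node.
def scanSchemas(df: DataFrame): Seq[StructType] =
  df.queryExecution.sparkPlan.collect {
    case scan: FileSourceScanExec => scan.requiredSchema
  }

spark.conf.set("spark.sql.optimizer.nestedSchemaPruning.enabled", true)

// Same query as in step 2.
val query = spark.table("persisted")
  .select(explode($"items").as("item"))
  .select(count(lit(true)))

// Prints one line per file scan, in the same struct<...> notation as ReadSchema.
scanSchemas(query).foreach(schema => println(schema.catalogString))
```

If pruning were applied, the printed struct would be expected to omit the nested fields that the query never references.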