Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Iceberg's planFiles() API is very expensive because it reads the Avro manifest files. It is especially expensive on object stores, though manifest caching can help there.
Currently we invoke this API twice during table loading (via IcebergUtil.getIcebergFiles()): once in loadAllPartition() and once in loadPartitionStats().
We should invoke IcebergUtil.getIcebergFiles() only once, then pass the result object to both loadAllPartition() and loadPartitionStats().
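
The proposed change could look roughly like the sketch below. It is only an illustration, not Impala's actual code: the class SinglePlanFilesSketch, the FileEntry type, the planCalls counter, and the method signatures are all hypothetical stand-ins. Only the method names getIcebergFiles(), loadAllPartition() and loadPartitionStats() come from the description above, and the real methods take different arguments.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class SinglePlanFilesSketch {
  // Hypothetical stand-in for one data file returned by a table scan.
  static final class FileEntry {
    final String path;
    final String partition;
    final long recordCount;

    FileEntry(String path, String partition, long recordCount) {
      this.path = path;
      this.partition = partition;
      this.recordCount = recordCount;
    }
  }

  // Counts how many times the expensive scan is performed (illustration only).
  static final AtomicInteger planCalls = new AtomicInteger();

  // Stand-in for IcebergUtil.getIcebergFiles(); the real method plans the scan
  // by reading manifest files, which is the cost we want to pay only once.
  static List<FileEntry> getIcebergFiles() {
    planCalls.incrementAndGet();
    List<FileEntry> files = new ArrayList<>();
    files.add(new FileEntry("a.parquet", "p=1", 10L));
    files.add(new FileEntry("b.parquet", "p=1", 5L));
    files.add(new FileEntry("c.parquet", "p=2", 7L));
    return files;
  }

  // Groups file paths by partition, using the already planned file list.
  static Map<String, List<String>> loadAllPartition(List<FileEntry> files) {
    Map<String, List<String>> byPartition = new HashMap<>();
    for (FileEntry f : files) {
      byPartition.computeIfAbsent(f.partition, k -> new ArrayList<>()).add(f.path);
    }
    return byPartition;
  }

  // Sums record counts per partition, reusing the same planned file list.
  static Map<String, Long> loadPartitionStats(List<FileEntry> files) {
    Map<String, Long> stats = new HashMap<>();
    for (FileEntry f : files) {
      stats.merge(f.partition, f.recordCount, Long::sum);
    }
    return stats;
  }

  public static void main(String[] args) {
    // Plan the scan once, then hand the same result to both consumers.
    List<FileEntry> files = getIcebergFiles();
    Map<String, List<String>> partitions = loadAllPartition(files);
    Map<String, Long> stats = loadPartitionStats(files);
    System.out.println("scan invocations: " + planCalls.get());
    System.out.println("partitions: " + partitions.keySet());
    System.out.println("stats: " + stats);
  }
}
```

The point of the sketch is that both loaders become pure consumers of one shared result, so the number of scans no longer depends on how many loading steps need the file list.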
Issue Links
- is related to: IMPALA-11658 Implement Iceberg manifest caching configuration for Impala (Resolved)