Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
1.15.0
Description
The performance of the method transformBinaryInMetadataCache scales poorly as the table's numbers of underlying files, row-groups and columns grow. This method is invoked during planning of every query using this table.
A test on a table using 219 directories (each with 20 files), 1 row-group in each file, and 94 columns, measured about 1340 milliseconds.
The main culprit are the version checks, which take place in every iteration (i.e., about 400k times in the previous example) and involve construction of 6 MetadataVersion objects (and possibly garbage collections).
Removing the version checks from the loops improved this method's performance on the above test down to about 250 milliseconds.
Attachments
Issue Links
- links to