Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-7069

Poor performance of transformBinaryInMetadataCache

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.15.0
    • Fix Version/s: 1.16.0
    • Component/s: Metadata
    • Labels:

      Description

      The performance of the method transformBinaryInMetadataCache scales poorly as the table's numbers of underlying files, row-groups and columns grow. This method is invoked during planning of every query using this table.

           A test on a table using 219 directories (each with 20 files), 1 row-group in each file, and 94 columns, measured about 1340 milliseconds.

          The main culprit are the version checks, which take place in every iteration (i.e., about 400k times in the previous example) and involve construction of 6 MetadataVersion objects (and possibly garbage collections).

           Removing the version checks from the loops improved this method's performance on the above test down to about 250 milliseconds.

       

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                ben-zvi Boaz Ben-Zvi
                Reporter:
                ben-zvi Boaz Ben-Zvi
                Reviewer:
                Vova Vysotskyi
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: