Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-7069

Poor performance of transformBinaryInMetadataCache

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.15.0
    • 1.16.0
    • Metadata

    Description

      The performance of the method transformBinaryInMetadataCache scales poorly as the table's numbers of underlying files, row-groups and columns grow. This method is invoked during planning of every query using this table.

           A test on a table using 219 directories (each with 20 files), 1 row-group in each file, and 94 columns, measured about 1340 milliseconds.

          The main culprit are the version checks, which take place in every iteration (i.e., about 400k times in the previous example) and involve construction of 6 MetadataVersion objects (and possibly garbage collections).

           Removing the version checks from the loops improved this method's performance on the above test down to about 250 milliseconds.

       

      Attachments

        Issue Links

          Activity

            People

              ben-zvi Boaz Ben-Zvi
              ben-zvi Boaz Ben-Zvi
              Vova Vysotskyi Vova Vysotskyi
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: