Apache Drill / DRILL-7028

Reduce the planning time of queries on large Parquet tables with large metadata cache files

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.17.0
    • Component/s: Metadata

    Description

      If a Parquet table consists of a large number of small files, its metadata cache file grows large. The planner has to read this large metadata cache file, which adds planning-time overhead. As a result, most of the query's execution time is spent in the planning phase.
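To illustrate why the number of files matters, here is a minimal sketch of a toy metadata cache that stores one entry per Parquet file. The layout, field names, and paths are hypothetical and are not Drill's actual cache format; the only point is that the serialized size, and therefore the amount a planner would have to read and parse, grows roughly linearly with the file count.

```python
import json


def build_metadata_cache(num_files, columns=("id", "ts", "value")):
    """Build a toy, hypothetical metadata cache with one entry per file.

    This is NOT Drill's real cache layout; it only models per-file metadata.
    """
    return {
        "files": [
            {
                "path": f"/data/table/part-{i:06d}.parquet",
                "rowCount": 1000,
                "columns": [{"name": c, "min": 0, "max": i} for c in columns],
            }
            for i in range(num_files)
        ]
    }


def cache_size_bytes(num_files):
    """Return the serialized size of the toy cache in bytes."""
    return len(json.dumps(build_metadata_cache(num_files)).encode("utf-8"))


if __name__ == "__main__":
    # Print the serialized size for increasing file counts.
    for n in (10, 1_000, 10_000):
        print(n, cache_size_bytes(n))
```

In this model, a table split into many small files produces a much larger cache than the same data stored in a few large files, even though the row count is identical.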

            People

              Assignee: Venkata Jyothsna Donapati (vdonapati)
              Reporter: Venkata Jyothsna Donapati (vdonapati)
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Original Estimate: 1,284h
                  Remaining Estimate: 1,284h
                  Time Spent: Not Specified