Apache Drill / DRILL-7028

Reduce the planning time of queries on large Parquet tables with large metadata cache files

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.17.0
    • Component/s: Metadata
    • Labels:

      Description

      If a Parquet table consists of a large number of small files, its metadata cache file grows large. The planner has to read this large cache file, which adds planning-time overhead, so most of the query's execution time is spent in the planning phase.
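
The scaling described above can be illustrated with a rough sketch. It assumes a toy, hypothetical per-file metadata layout (the field names, paths, and column count are invented for illustration and are not Drill's actual cache format) and only shows that a cache holding one entry per file grows with the number of files in the table.

```python
import json


def build_metadata_cache(num_files, columns_per_file=10):
    """Build a toy metadata cache with one entry per Parquet file.

    The layout is hypothetical and is not Drill's real cache format.
    """
    files = []
    for i in range(num_files):
        files.append({
            "path": f"/data/table/part-{i:06d}.parquet",  # made-up path
            "length": 1024,
            "columns": [
                {"name": f"col_{c}", "min": 0, "max": 100, "nulls": 0}
                for c in range(columns_per_file)
            ],
        })
    return {"files": files}


def serialized_size(num_files):
    """Return the size in characters of the JSON-serialized toy cache."""
    return len(json.dumps(build_metadata_cache(num_files)))


if __name__ == "__main__":
    # More files means a proportionally larger cache to read at planning time.
    for n in (100, 1_000, 10_000):
        print(n, serialized_size(n))
```

In this sketch the serialized size grows roughly linearly with the file count, so a planner that has to load the whole cache pays more as the table is split into more small files.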

              People

              • Assignee: Venkata Jyothsna Donapati (vdonapati)
              • Reporter: Venkata Jyothsna Donapati (vdonapati)
              • Votes: 0
              • Watchers: 4

                  Time Tracking

                  • Original Estimate: 1,284h
                  • Remaining Estimate: 1,284h
                  • Time Spent: Not Specified