Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-4826

Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.9.0
    • None
    • None

    Description

      Queries against INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.VIEWS slow down as the number of views increases.

      BI tools like Tableau issue a query like the following at connection time:

      select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA'ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
      

      The time to query the information schema tables degrades as the number of views increases. On a test system:

      Views Time(secs)
      500 6
      1000 19
      1500 33

      This can result in a single connection taking more than a minute to establish.

      The problem occurs because we read the view file for every view and this appears to take most of the time.

      Querying information_schema.tables does not, in fact, need to open the view file at all, it merely needs to get a listing of the view files. Eliminating the view file read will speed up the query tremendously.

      Attachments

        Issue Links

          Activity

            People

              parthc Parth Chandra
              parthc Parth Chandra
              Dechang Gu Dechang Gu
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: