Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-4826

Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.9.0
    • None
    • None

    Description

      Queries against INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.VIEWS slow down as the number of views increases.

      BI tools like Tableau issue a query like the following at connection time:

      select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA'ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
      

      The time to query the information schema tables degrades as the number of views increases. On a test system:

      Views Time(secs)
      500 6
      1000 19
      1500 33

      This can result in a single connection taking more than a minute to establish.

      The problem occurs because we read the view file for every view and this appears to take most of the time.

      Querying information_schema.tables does not, in fact, need to open the view file at all, it merely needs to get a listing of the view files. Eliminating the view file read will speed up the query tremendously.

      Attachments

        Issue Links

          Activity

            People

              parthc Parth Chandra
              parthc Parth Chandra
              dgu-atmapr dgu-atmapr
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: