Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-4826

Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.9.0
    • Component/s: None
    • Labels:
      None

      Description

      Queries against INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.VIEWS slow down as the number of views increases.

      BI tools like Tableau issue a query like the following at connection time:

      select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA'ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
      

      The time to query the information schema tables degrades as the number of views increases. On a test system:

      Views Time(secs)
      500 6
      1000 19
      1500 33

      This can result in a single connection taking more than a minute to establish.

      The problem occurs because we read the view file for every view and this appears to take most of the time.

      Querying information_schema.tables does not, in fact, need to open the view file at all, it merely needs to get a listing of the view files. Eliminating the view file read will speed up the query tremendously.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                parthc Parth Chandra
                Reporter:
                parthc Parth Chandra
                Reviewer:
                Dechang Gu
              • Votes:
                1 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: