  1. Apache Drill
  2. DRILL-4127

HiveSchema.getSubSchema() should use lazy loading of all the table names


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels: None

    Description

      Currently, HiveSchema.getSubSchema() pre-loads all the table names when it constructs the subschema, even if those table names are never requested. This can cause considerable performance overhead, especially when the Hive schema contains a large number of objects (thousands of tables/views are not uncommon in some use cases).

      Instead, we should load the table names on demand: only when there is a request for all the table names do we load them into the Hive schema.

      This should help "show schemas", since it only requires the schema name, not the table names in the schema.
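
      A minimal sketch of the on-demand loading proposed above, assuming a hypothetical LazyHiveSubSchema class and a tableNameLoader supplier that stands in for the expensive metastore call. These names are illustrative only and are not Drill's actual HiveSchema code.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical stand-in for a Hive sub-schema; not Drill's actual class. */
final class LazyHiveSubSchema {
    private final String name;
    // Assumed to wrap the expensive call that lists every table in the schema.
    private final Supplier<Set<String>> tableNameLoader;
    // Stays null until the table names are requested for the first time.
    private Set<String> tableNames;

    LazyHiveSubSchema(String name, Supplier<Set<String>> tableNameLoader) {
        // Construction records the loader but does not invoke it.
        this.name = name;
        this.tableNameLoader = tableNameLoader;
    }

    /** Cheap: answering with the schema name never triggers a table listing. */
    String getName() {
        return name;
    }

    /** Loads the table names on first use and caches them for later calls. */
    synchronized Set<String> getTableNames() {
        if (tableNames == null) {
            tableNames = Collections.unmodifiableSet(new TreeSet<>(tableNameLoader.get()));
        }
        return tableNames;
    }
}
```

      With this shape, a request that needs only schema names calls getName() and never invokes the loader, while the first getTableNames() call pays the listing cost once.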


          People

            Assignee: Jinfeng Ni (jni)
            Reporter: Jinfeng Ni (jni)
            Dechang Gu
            Votes: 0
            Watchers: 4
