  1. Hive
  2. HIVE-22947

The method getTableObjectsByName() in HiveMetaStoreClient.java is slow

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Standalone Metastore
    • Labels:
      None

      Description

      The getTableObjectsByName() RPC in HiveMetaStoreClient.java (https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java#L2111-L2114) is very slow. In an empirical evaluation, loading the complete metadata of all tables in a database of 40,000 tables took at least 170 seconds with getTableObjectsByName(), whereas getAllTables() (https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java#L2281-L2288) took less than 0.5 seconds on the same machine.
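
      A comparison like the one above can be reproduced with a small timing helper. The following is only a minimal sketch: the class RpcTimingSketch and the method timeMillis() are hypothetical names, not part of Hive, and the lambda in main() is a stand-in for a real metastore call. Against a live metastore the callable would wrap something like client.getAllTables(dbName) or client.getTableObjectsByName(dbName, tableNames).

```java
import java.util.concurrent.Callable;

/** Hypothetical timing helper; not part of Hive or Impala. */
class RpcTimingSketch {

    /**
     * Runs the given call once and returns the elapsed wall-clock time in
     * milliseconds. Checked exceptions are rethrown as RuntimeException so
     * that calls which declare checked exceptions can be passed in directly.
     */
    static <T> long timeMillis(Callable<T> call) {
        long start = System.nanoTime();
        try {
            call.call();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Stand-in for an RPC: sleeps briefly instead of contacting a metastore.
        long elapsed = timeMillis(() -> {
            Thread.sleep(50);
            return "done";
        });
        System.out.println("call took " + elapsed + " ms");
    }
}
```

      A single run per call is enough to see a gap of the size reported here, but repeated runs would be needed for smaller differences.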

      In some use cases, not all fields of the class org.apache.hadoop.hive.metastore.api.Table are required. For instance, if a client only needs to determine the type of a table, e.g., whether it is an HDFS table or a Kudu table, it should suffice to load only the sd field, which is of class org.apache.hadoop.hive.metastore.api.StorageDescriptor. It would be great if getTableObjectsByName() could be made more fine-grained so that only the fields specified by the client are retrieved, which could also reduce the time spent on this RPC.
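
      To illustrate the kind of fine-grained retrieval requested here, the following is a rough sketch only. Every type in it (TableField, TableStub, FakeStore) is hypothetical and exists purely for illustration; it does not reflect the actual metastore API or how such a feature would be implemented in Hive. The idea is that the caller passes the set of fields it needs and the store skips loading everything else.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of field projection; none of these types exist in Hive. */
class TableProjectionSketch {

    /** Fields a caller may request (a made-up subset of Table's fields). */
    enum TableField { SD, PARAMETERS }

    /** Simplified stand-in for org.apache.hadoop.hive.metastore.api.Table. */
    static final class TableStub {
        final String name;
        final String sd;          // stand-in for the StorageDescriptor
        final String parameters;  // stand-in for the table parameters

        TableStub(String name, String sd, String parameters) {
            this.name = name;
            this.sd = sd;
            this.parameters = parameters;
        }
    }

    /** In-memory stand-in for the metastore that counts how often each field is loaded. */
    static final class FakeStore {
        int sdLoads = 0;
        int parameterLoads = 0;

        /** Loads only the requested fields of each table; other fields stay null. */
        List<TableStub> getTableObjectsByName(List<String> names, Set<TableField> wanted) {
            List<TableStub> result = new ArrayList<>();
            for (String name : names) {
                String sd = null;
                String parameters = null;
                if (wanted.contains(TableField.SD)) {
                    sd = "sd-of-" + name;
                    sdLoads++;
                }
                if (wanted.contains(TableField.PARAMETERS)) {
                    parameters = "params-of-" + name;
                    parameterLoads++;
                }
                result.add(new TableStub(name, sd, parameters));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        // A client that only needs the table type asks for the sd field alone.
        List<TableStub> tables =
            store.getTableObjectsByName(Arrays.asList("t1", "t2"), EnumSet.of(TableField.SD));
        System.out.println(tables.size() + " tables, parameter loads: " + store.parameterLoads);
    }
}
```

      In this toy model the saving shows up as a load counter that stays at zero for fields that were not requested. Whether a real implementation would yield a comparable saving depends on where the time in the RPC is actually spent.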

      A spreadsheet with the detailed experimental results is also attached (Benchmark_related_to_IMPALA-9363.pdf). In the experiment, Impala's catalogd, acting as a client of the Hive metastore, calls getTableObjectsByName() to retrieve the complete metadata of the tables in a database containing 40,000 tables.

            People

            • Assignee:
              thejas Thejas Nair
              Reporter:
              fangyurao Fang-Yu Rao
            • Votes:
              0
              Watchers:
              3