Hive / HIVE-24805

Compactor: Initiator shouldn't fetch table details again and again for partitioned tables

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Transactions
    • Labels:
      None

      Description

      The Initiator shouldn't fetch table details separately for each partition of a table. When there are a large number of databases/tables, the Initiator takes a long time to complete its initial iteration, and the load on the DB also increases.

      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L129

      https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L456

      For all of the following partitions, the table details are the same. However, the Initiator ends up fetching the table details from HMS again for each partition.

      2021-02-22 08:13:16,106 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899
      2021-02-22 08:13:16,124 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830
      2021-02-22 08:13:16,140 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586
      2021-02-22 08:13:16,149 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452698
      2021-02-22 08:13:16,158 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452063
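
One possible direction, sketched below, is to memoize table lookups for the duration of a single Initiator cycle so that all partitions of a table share one HMS fetch. This is a minimal illustration and not the actual Initiator code: the class `TableLookupCache`, its methods, and the `loader` function (a stand-in for the HMS call) are all hypothetical names introduced here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: caches table details per "db.table" key so that
 * repeated lookups for partitions of the same table hit the loader once.
 * Assumes a single checking thread, as in the log above; a concurrent
 * variant would need a thread-safe map and counter.
 */
final class TableLookupCache<T> {
    // Stand-in for the expensive metastore call (an assumption, not a Hive API).
    private final Function<String, T> loader;
    private final Map<String, T> cache = new HashMap<>();
    private int loads = 0;

    TableLookupCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns cached details for db.table, invoking the loader only on a miss. */
    T get(String dbName, String tableName) {
        String key = dbName + "." + tableName;
        return cache.computeIfAbsent(key, k -> {
            loads++;
            return loader.apply(k);
        });
    }

    /** Number of times the loader was actually invoked. */
    int loadCount() {
        return loads;
    }

    /** Drops all entries; intended to be called at the start of each cycle. */
    void clear() {
        cache.clear();
    }
}
```

With such a cache, checking the five `store_returns_tmp2` partitions listed above would call the loader once instead of five times. Clearing the cache at the start of each cycle keeps table metadata from going stale across iterations; whether that freshness guarantee is sufficient for the Initiator is a design question this sketch does not settle.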
      

            People

            • Assignee: Unassigned
            • Reporter: Rajesh Balamohan (rajesh.balamohan)