Hive / HIVE-24649

Optimise Hive::addWriteNotificationLog for large data inserts


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • None
    • None
    • Component/s: HiveServer2

    Description

      When loading dynamic partitions with a large dataset, a lot of time is spent in "Hive::loadDynamicPartitions --> addWriteNotificationLog".

      Though all the partitions belong to the same table, it ends up loading the table and partition details again for every partition before writing to the notification log.

      Also, the "Partition" details may already be present in the PartitionDetails object in Hive::loadDynamicPartitions. They are unnecessarily recomputed in HiveMetaStore::add_write_notification_log.
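
The cost pattern described above can be illustrated with a small, self-contained sketch. It is not Hive code: `FakeMetastore`, `Table`, `Partition`, `logPerPartition` and `logWithReuse` are hypothetical stand-ins invented for this example, and the counters only model how many lookups each approach would issue. It assumes the partition objects are already available to the caller, as the description suggests may be the case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these types are real Hive classes.
class NotificationLogSketch {

    static final class Table {
        final String name;
        Table(String name) { this.name = name; }
    }

    static final class Partition {
        final String spec;
        Partition(String spec) { this.spec = spec; }
    }

    /** Stand-in for a metastore that counts how often it is queried. */
    static final class FakeMetastore {
        int tableLookups = 0;
        int partitionLookups = 0;

        Table getTable(String name) {
            tableLookups++;
            return new Table(name);
        }

        Partition getPartition(Table table, String spec) {
            partitionLookups++;
            return new Partition(spec);
        }
    }

    /** Reloads the table and the partition for every partition spec. */
    static List<String> logPerPartition(FakeMetastore ms, String tableName, List<String> specs) {
        List<String> log = new ArrayList<>();
        for (String spec : specs) {
            Table table = ms.getTable(tableName);            // repeated for the same table
            Partition part = ms.getPartition(table, spec);   // recomputed although the caller may have it
            log.add(table.name + "/" + part.spec);
        }
        return log;
    }

    /** Loads the table once and reuses partitions the caller already holds. */
    static List<String> logWithReuse(FakeMetastore ms, String tableName, List<Partition> known) {
        Table table = ms.getTable(tableName);                // single lookup
        List<String> log = new ArrayList<>();
        for (Partition part : known) {
            log.add(table.name + "/" + part.spec);           // no further lookups
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> specs = Arrays.asList("ds=1", "ds=2", "ds=3");

        FakeMetastore naive = new FakeMetastore();
        logPerPartition(naive, "sales", specs);
        System.out.println("per-partition: table=" + naive.tableLookups
                + " partition=" + naive.partitionLookups);

        FakeMetastore reused = new FakeMetastore();
        List<Partition> known = new ArrayList<>();
        for (String spec : specs) {
            known.add(new Partition(spec));
        }
        logWithReuse(reused, "sales", known);
        System.out.println("with reuse:    table=" + reused.tableLookups
                + " partition=" + reused.partitionLookups);
    }
}
```

In this model the per-partition variant issues one table lookup and one partition lookup per partition, while the reuse variant issues a single table lookup in total. Whether the actual fix takes this shape is not stated in the description above.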

       
      Lines of interest:
      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
      https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
       

            People

              Assignee: Unassigned
              Reporter: Rajesh Balamohan (rajesh.balamohan)