  1. Hive
  2. HIVE-22081

Hive Metastore Performance: Compaction Initiator thread is overwhelmed when too many tables/partitions are eligible for compaction

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Transactions
    • Labels: None

    Description

      If automatic compaction is turned on, the Initiator thread looks for tables/partitions that are potentially eligible for compaction and runs several checks on them in a for loop before requesting compaction for the eligible ones. Although the Initiator thread is configured to run at a 5-minute interval by default, when there are many objects it keeps running continuously, because these checks are I/O intensive and hog the CPU.
      In the proposed changes, I am planning to:
      1. Pass fewer objects to the for loop by filtering them beforehand, using the conditions that are currently checked inside the loop.
      2. Make an asynchronous call, using a future, to determine the compaction type (this is where the FileSystem calls are made).
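
      The two proposed changes can be illustrated with a minimal Java sketch. This is not the code from the attached patches; the class InitiatorSketch, the Candidate type, the noAutoCompact flag, the alreadyQueued set, and the determineType callback are all hypothetical stand-ins. It only shows the general shape: cheap in-memory filtering first, then the expensive type check submitted as futures to a bounded pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch; names do not come from the Hive code base.
class InitiatorSketch {
    public enum CompactionType { NONE, MINOR, MAJOR }

    /** Stand-in for a table/partition that may need compaction. */
    public static final class Candidate {
        public final String name;
        public final boolean noAutoCompact; // assumed "skip me" flag
        public Candidate(String name, boolean noAutoCompact) {
            this.name = name;
            this.noAutoCompact = noAutoCompact;
        }
    }

    /** Change 1: apply the cheap checks up front so the loop sees fewer objects. */
    public static List<Candidate> preFilter(List<Candidate> all, Set<String> alreadyQueued) {
        return all.stream()
                .filter(c -> !c.noAutoCompact)
                .filter(c -> !alreadyQueued.contains(c.name))
                .collect(Collectors.toList());
    }

    /**
     * Change 2: run the expensive type check (the part that would touch the
     * file system) asynchronously, then collect the results in input order.
     */
    public static List<String> requestCompactions(
            List<Candidate> eligible,
            Function<Candidate, CompactionType> determineType,
            int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (Candidate c : eligible) {
                futures.add(CompletableFuture
                        .supplyAsync(() -> determineType.apply(c), pool)
                        // null marks "no compaction needed" for this candidate
                        .thenApply(t -> t == CompactionType.NONE ? null : c.name + ":" + t));
            }
            List<String> requests = new ArrayList<>();
            for (CompletableFuture<String> f : futures) {
                String request = f.join(); // waits for this candidate's check
                if (request != null) {
                    requests.add(request);
                }
            }
            return requests;
        } finally {
            pool.shutdown();
        }
    }
}
```

      In this sketch the pool size bounds how many file system checks run at once, which is one possible way to keep a large candidate list from monopolizing the thread; the actual patch may make a different choice.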

      Attachments

        1. HIVE-22081.patch
          12 kB
          Rajkumar Singh
        2. HIVE-22081.04.patch
          9 kB
          Rajkumar Singh
        3. HIVE-21917.03.patch
          9 kB
          Rajkumar Singh
        4. HIVE-21917.02.patch
          9 kB
          Rajkumar Singh
        5. HIVE-21917.01.patch
          13 kB
          Rajkumar Singh

          People

            Assignee: Rajkumar Singh
            Reporter: Rajkumar Singh
            Votes: 0
            Watchers: 5
