HIVE-22081: Hive Metastore performance: Compaction Initiator thread is overwhelmed when too many tables/partitions are eligible for compaction

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1
    • Fix Version/s: 4.0.0
    • Component/s: Transactions
    • Labels:
      None

      Description

      When automatic compaction is enabled, the Initiator thread looks for tables/partitions that are potentially eligible for compaction and runs a series of checks on each one in a for loop before requesting compaction. Although the Initiator is configured to run every 5 minutes by default, with many objects it runs almost continuously, because these checks are I/O-intensive and consume a lot of CPU.
      The proposed changes are:
      1. Pass fewer objects to the for loop by filtering them up front, using the same conditions that are currently checked inside the loop.
      2. Determine the compaction type asynchronously using a Future (this is the step that makes the FileSystem calls).
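
      The two proposed changes could be sketched roughly as follows. This is a minimal illustration only, not the code from the attached patches: the class InitiatorSketch, the methods filterCandidates and resolveTypes, the "already queued" filter condition, and the resolver function standing in for the FileSystem checks are all hypothetical names and assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names come from the Hive code base.
class InitiatorSketch {

    enum CompactionType { MAJOR, MINOR, NONE }

    // Change 1: drop candidates before the loop instead of checking inside it.
    // "Already queued" is an assumed example of such a condition.
    static List<String> filterCandidates(List<String> candidates,
                                         Set<String> alreadyQueued) {
        return candidates.stream()
                .filter(c -> !alreadyQueued.contains(c))
                .collect(Collectors.toList());
    }

    // Change 2: run the expensive per-object check on a thread pool.
    // The resolver stands in for the logic that makes FileSystem calls.
    static Map<String, CompactionType> resolveTypes(
            List<String> eligible,
            Function<String, CompactionType> resolver,
            ExecutorService pool) {
        // Submit every check first so they can run concurrently.
        Map<String, CompletableFuture<CompactionType>> futures = new LinkedHashMap<>();
        for (String c : eligible) {
            futures.put(c, CompletableFuture.supplyAsync(() -> resolver.apply(c), pool));
        }
        // Then wait for each result, preserving submission order.
        Map<String, CompactionType> result = new LinkedHashMap<>();
        futures.forEach((name, future) -> result.put(name, future.join()));
        return result;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<String> eligible = filterCandidates(
                    List.of("db.t1", "db.t2/p=1", "db.t3"),
                    Set.of("db.t2/p=1"));
            Map<String, CompactionType> types = resolveTypes(
                    eligible,
                    name -> name.endsWith("t3") ? CompactionType.MAJOR : CompactionType.MINOR,
                    pool);
            System.out.println(types);
        } finally {
            pool.shutdown();
        }
    }
}
```

      Submitting all checks before waiting on any of them is what lets the I/O-bound work overlap; the pool size of 4 is an arbitrary choice for the example.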

        Attachments

        1. HIVE-22081.patch
          12 kB
          Rajkumar Singh
        2. HIVE-22081.04.patch
          9 kB
          Rajkumar Singh
        3. HIVE-21917.03.patch
          9 kB
          Rajkumar Singh
        4. HIVE-21917.02.patch
          9 kB
          Rajkumar Singh
        5. HIVE-21917.01.patch
          13 kB
          Rajkumar Singh

            People

            • Assignee:
              Rajkumar Singh
              Reporter:
              Rajkumar Singh
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved: