HBase / HBASE-27590

Change Iterable to List in SnapshotFileCache


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.4
    • Component/s: None
    • Labels: None

    Description

      The HFileCleaners can perform poorly on a large /archive area when used with slow storage like S3. The snapshot write lock in SnapshotFileCache is held while file metadata is fetched from S3, so even with multiple cleaner threads only a single cleaner can effectively delete files from the archive.
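
      As a rough illustration of the problem, the sketch below (hypothetical code, not the actual SnapshotFileCache; names are illustrative only) shows the pattern: the candidate files are iterated while the lock is held, so if the incoming Iterable is lazy, every per-file metadata fetch (for example an S3 round trip) runs inside the critical section and blocks all other cleaner threads.

      import java.util.ArrayList;
      import java.util.HashSet;
      import java.util.List;
      import java.util.Set;
      import java.util.concurrent.locks.ReentrantLock;

      import org.apache.hadoop.fs.FileStatus;

      // Hypothetical sketch of the problematic pattern, not the real SnapshotFileCache.
      final class SnapshotCacheSketch {
        private final ReentrantLock lock = new ReentrantLock();
        private final Set<String> filesReferencedBySnapshots = new HashSet<>();

        List<FileStatus> getUnreferencedFiles(Iterable<FileStatus> candidates) {
          List<FileStatus> unreferenced = new ArrayList<>();
          lock.lock();
          try {
            // If 'candidates' is a lazy Iterable, each iteration step may trigger a
            // remote FileStatus lookup (e.g. against S3) while the lock is held.
            for (FileStatus file : candidates) {
              if (!filesReferencedBySnapshots.contains(file.getPath().getName())) {
                unreferenced.add(file);
              }
            }
          } finally {
            lock.unlock();
          }
          return unreferenced;
        }
      }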

      File metadata collection can be moved ahead of SnapshotHFileCleaner simply by changing the parameter type passed to FileCleanerDelegate from Iterable to List.
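
      A minimal sketch of that kind of signature change is shown below. The method name getDeletableFiles comes from the existing FileCleanerDelegate interface; super-interfaces and other methods are elided, and the actual patch may differ in detail (for example in the return type).

      import java.util.List;

      import org.apache.hadoop.fs.FileStatus;

      public interface FileCleanerDelegate /* super-interfaces elided */ {

        // Before: a (possibly lazy) Iterable, so filtering work done by upstream
        // delegates can be deferred until a downstream delegate iterates it.
        // Iterable<FileStatus> getDeletableFiles(Iterable<FileStatus> files);

        // After: a materialized List, so the file metadata has already been
        // collected before this delegate (and any lock it takes) is reached.
        List<FileStatus> getDeletableFiles(List<FileStatus> files);
      }

      With the List signature, CleanerChore can materialize the output of each delegate before handing it to the next one, which is what moves the S3 metadata calls out of the snapshot lock.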

      Running with the cleaner configuration below, I observed that the time the lock was held in SnapshotFileCache went down from 45000 ms to 100 ms for a directory of 1000 files. The complete evaluation and deletion for this folder took the same amount of time, but since the file metadata fetch from S3 was done outside of the lock, multiple cleaner threads were able to run concurrently.

      hbase.cleaner.directory.sorting=false
      hbase.cleaner.scan.dir.concurrent.size=0.75
      hbase.regionserver.hfilecleaner.small.thread.count=16
      hbase.regionserver.hfilecleaner.large.thread.count=8
      

      The files to evaluate are already passed as a List to CleanerChore.checkAndDeleteFiles, but the list is converted to an Iterable to run the checks on the configured cleaners.
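
      To make the difference concrete, here is an illustrative sketch (not HBase code) of lazy versus eager chaining of deletability checks. With a lazy Iterable, the per-file check, which can involve a remote metadata fetch, only runs when the next consumer iterates the result, potentially inside the SnapshotFileCache lock; with a List, the same checks run eagerly, before any lock is taken.

      import java.util.ArrayList;
      import java.util.Iterator;
      import java.util.List;
      import java.util.NoSuchElementException;
      import java.util.function.Predicate;

      final class ChainingSketch {

        // Lazy chaining: nothing runs until the caller iterates the result, so an
        // expensive check inside 'deletable' executes at iteration time.
        static <T> Iterable<T> filterLazily(Iterable<T> files, Predicate<T> deletable) {
          return () -> new Iterator<T>() {
            private final Iterator<T> it = files.iterator();
            private T pending;

            @Override
            public boolean hasNext() {
              while (pending == null && it.hasNext()) {
                T candidate = it.next();
                if (deletable.test(candidate)) { // e.g. remote metadata fetch happens here
                  pending = candidate;
                }
              }
              return pending != null;
            }

            @Override
            public T next() {
              if (!hasNext()) {
                throw new NoSuchElementException();
              }
              T result = pending;
              pending = null;
              return result;
            }
          };
        }

        // Eager chaining: the same checks run up front, so the caller receives a
        // fully materialized List and no lock needs to be held while they execute.
        static <T> List<T> filterEagerly(List<T> files, Predicate<T> deletable) {
          List<T> result = new ArrayList<>(files.size());
          for (T file : files) {
            if (deletable.test(file)) {
              result.add(file);
            }
          }
          return result;
        }
      }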

      Attachments

        1. flame-1.html (315 kB, Peter Somogyi)

            People

              Assignee: Peter Somogyi (psomogyi)
              Reporter: Peter Somogyi (psomogyi)
