  1. HBase
  2. HBASE-23634 Enable "Split WAL to HFile" by default
  3. HBASE-24619

Try to compact the recovered HFiles first after the region comes online


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3.0
    • None
    • None
    • None

    Description

      As discussed in HBASE-23739 and HBASE-24632, there may be many recovered HFiles after a WAL split. We should find a better way to compact them first, once the region is online.

       

      For instance (quoting our anoop.hbase):

      "Assume there were some small files because of flush but never got compacted before the RS down happened. We will look for the possible candidate from oldest files and in all chance the very old files would get excluded because of the size math. But It is possible that new flushed files would get selected. And we have the max files to compact config also which is 10 by default. Even these small files count alone might be >10. If there are say 15 WAL files to split, for sure we will have at least 15 small HFiles.
      My thinking was this. After the region open, we have to make sure these small files are compacted in one go and we should not even consider the max files limit for this compaction. Also to note that this files might not even have the DBE/compression etc being applied. Ya coding wise am not sure how clean it might come."
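
      To make the "size math" and the max-files limit in the quote above concrete, here is a minimal sketch. It is a simplified model of ratio-based selection, not the actual HBase policy code. The class name, both method names, the size units (MB) and the sample values (ratio 1.2, min compact size 128, min files 3) are assumptions chosen for illustration; only the max-files value of 10 and the count of 15 recovered files come from the quote.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoveredFileCompactionSketch {

    /**
     * Simplified model of ratio-based selection (NOT the real HBase implementation).
     * Files are ordered oldest first. An old file is skipped while it is larger than
     * max(minCompactSize, ratio * total size of the next (maxFiles - 1) newer files).
     * The remaining candidates are then capped at maxFiles.
     */
    static List<Long> selectByRatio(List<Long> sizesOldestFirst, double ratio,
                                    long minCompactSize, int minFiles, int maxFiles) {
        int n = sizesOldestFirst.size();
        int start = 0;
        while (start < n) {
            long sumNewer = 0;
            int end = Math.min(n, start + maxFiles);
            for (int i = start + 1; i < end; i++) {
                sumNewer += sizesOldestFirst.get(i);
            }
            long threshold = Math.max(minCompactSize, (long) (ratio * sumNewer));
            if (sizesOldestFirst.get(start) <= threshold) {
                break; // this file and everything newer are candidates
            }
            start++; // file too large relative to newer files: exclude it
        }
        List<Long> selected = new ArrayList<>(sizesOldestFirst.subList(start, n));
        if (selected.size() < minFiles) {
            return new ArrayList<>(); // not enough files to justify a compaction
        }
        if (selected.size() > maxFiles) {
            // The cap applies here: only maxFiles candidates go into this compaction.
            selected = new ArrayList<>(selected.subList(0, maxFiles));
        }
        return selected;
    }

    /**
     * Hypothetical alternative matching the proposal in the quote: compact every
     * recovered file in one go, ignoring the max-files limit.
     */
    static List<Long> selectAllRecovered(List<Long> recoveredSizes) {
        return new ArrayList<>(recoveredSizes);
    }

    public static void main(String[] args) {
        List<Long> sizes = new ArrayList<>();
        sizes.add(10_000L);              // one very old, large file
        for (int i = 0; i < 15; i++) {
            sizes.add(8L);               // 15 small recovered files, one per split WAL
        }
        List<Long> byRatio = selectByRatio(sizes, 1.2, 128L, 3, 10);
        List<Long> allRecovered = selectAllRecovered(sizes.subList(1, sizes.size()));
        System.out.println("ratio-based pass selects " + byRatio.size() + " files");
        System.out.println("one-shot pass selects " + allRecovered.size() + " files");
    }
}
```

      In this model the old large file is excluded by the size check, and the cap leaves some of the small recovered files for a later compaction, which is the situation the quote describes.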

       

      And from our pankaj2461:

       

      "...concern is the compaction after region open, which impact MTTR due to heavy IO in large cluster with many outstanding WALs"

       

            People

              Assignee: zghao Guanghao Zhang
              Reporter: zghao Guanghao Zhang
              Votes: 0
              Watchers: 7
