Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Invalid
-
0.90.0
-
None
-
None
-
None
Description
This is something I've seen here since we upgraded to 0.89 since Gets are now Scans, although I don't currently have any strong evidence that it wasn't happening in 0.20.
I saw this when we began getting alerts on the frontend that some requests were taking more than 8 seconds to complete. Even getting a value could take more than 3 minutes in the shell. The first thing I did was major compacting the table that was slow and the problem went away immediately. Looking in the logs, it seems the compaction transformed 2 files of (total) 550MB into 5.2MB. Looking back in the logs for September, it appears that that table was never major compacted and was slowly growing everyday. Some more grepping around showed that quite a few regions were never major compacted.
I'm still looking at the code, but the issue seems to be that the minor compactions are always happening on all store files more than once per day on certain regions, meaning that the oldest timestamp is always smaller than hbase.hregion.majorcompaction and major compactions are never triggered.