Example of a high number of updates on hot regions
Say I have many regions, say 100, on a server; a few are getting a lot of updates and the others are getting only some updates.
The few that are getting the bulk of the updates will have many map files, so their compactions will hit the 6-map-file limit, but those compactions will be quick to finish, since we will only be working with, say, 16-20MB rather than 64MB+,
if we go from new to old and leave out the oldest map files, which are the largest and take the longest to include in a compaction.
The other regions will still tie up the compaction thread for, say, 10 minutes each, even on regions that have only 3-4 map files, because their compactions will include the larger map files.
In that time, the few regions that are getting lots of updates will be flushing more often, meaning they will pile up many map files.
We will be spending most of our time compacting regions that have only a few map files (but include the larger map files that take the longest to compact) instead of the regions that have the most map files to compact.
In my example above, if all or most of the regions flushed a map file and entered the queue for compaction, it would be 16 hours before we got back to the few regions that had been getting the bulk of the updates. When we got back to them, we would only be processing 6 of their map files again, leaving many map files behind for the next compaction, and then we would loop and do all the others again, assuming they get a few flushes over the 16 hours it takes to complete the compaction on all the regions.
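(Rough math behind the 16 hours, assuming the ~10 minutes per region above: 100 regions x 10 minutes = 1,000 minutes, or roughly 16.7 hours for one full pass of the compaction queue.)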
We should try to come up with a simple test, outside of Hudson, to get real numbers on the time it takes to do a scan on a region; say, run the test with 10, 20, 100, and 500 map files.
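Something along these lines could be a starting point for that test. It is only a minimal sketch, not real HBase code: it merges N sorted in-memory arrays the way a region scanner merges its map files and times one full pass; the class name, row counts, and file counts are all made up, and real numbers would of course need actual map files on disk.

    import java.util.*;

    public class ScanTimingSketch {
        public static void main(String[] args) {
            int rowsPerFile = 10000;                       // assumed rows per map file
            for (int files : new int[] {10, 20, 100, 500}) {
                // build N key-sorted "map files" in memory
                List<long[]> mapFiles = new ArrayList<>();
                Random rnd = new Random(42);
                for (int f = 0; f < files; f++) {
                    long[] keys = new long[rowsPerFile];
                    for (int i = 0; i < rowsPerFile; i++) keys[i] = rnd.nextLong();
                    Arrays.sort(keys);
                    mapFiles.add(keys);
                }
                long start = System.nanoTime();
                // merge all files in key order, as a scanner would; entries are {key, file, offset}
                PriorityQueue<long[]> heap =
                        new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
                for (int f = 0; f < files; f++)
                    heap.add(new long[] {mapFiles.get(f)[0], f, 0});
                long seen = 0;
                while (!heap.isEmpty()) {
                    long[] top = heap.poll();
                    seen++;
                    int f = (int) top[1];
                    int next = (int) top[2] + 1;
                    if (next < rowsPerFile)
                        heap.add(new long[] {mapFiles.get(f)[next], f, next});
                }
                long ms = (System.nanoTime() - start) / 1000000;
                System.out.println(files + " files: merged " + seen + " keys in " + ms + " ms");
            }
        }
    }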
With my new idea above, we could keep the number of map files under control by only compacting map files under X size, keeping compactions fast.
The test may show that we can handle, say, 50 medium-size map files during a scan without much impact on speed. If that is the case, then we may only have to do major compactions, where we merge all the map files together, once every few days. The exception is after a split: then we would want to do a major compaction soon, to remove the out-of-range data from each new region.
The bottleneck I have seen on compaction is with block compression: we are bound by CPU speed to gzip the map file after compaction. So I would rather run one large compaction every day or two, and have to gzip the biggest part of each region only every few days, instead of every day or more than once a day. In my mind, gunzipping 64MB of data, adding 4MB, gzipping it again, and doing this many times a day is wasting resources; I think that is wasting CPU time gzipping the same data over and over again.
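(To put a number on it, using the 64MB/4MB figures above: each such pass gunzips and re-gzips roughly 16x as much old data as new data, so run daily, most of the compression CPU is spent re-encoding bytes that have not changed since the last compaction.)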
My idea here is to spend more compaction time on the regions getting more updates than on the other regions, so we can handle more regions per server.
Currently my example above is based on 100 regions totaling 9GB of compressed data. With those kinds of numbers per server, someone wanting to store a TB of compressed data in HBase would need a very large number of servers or would have to have low update traffic.
I know we have some other issues with how many regions a server can handle, like the per-server open-file limits, but I would like to see this compaction problem fixed once, giving us the most efficient compaction we can for all users and keeping it from becoming an issue later down the road. In the end, if we go with this new idea, compactions would be faster and use fewer resources during bulk updates, leaving more resources for other tasks running on the server, like map tasks.
So my proposed idea would be to have two types of compactions (a rough sketch of the decision logic follows the list):
1. Compact new flushes into one map file until it reaches a size threshold in MB, then leave it for the compaction below.
2. Compact all map files for a region together once every X days, or when we are a child region from a split.
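Here is a minimal sketch of what that decision logic might look like. Every name and threshold (MAX_MINOR_BYTES standing in for the X size, MAJOR_INTERVAL_MS for the X days, and so on) is invented for illustration and is not taken from the actual compaction code:

    import java.util.*;
    import java.util.concurrent.TimeUnit;

    public class CompactionPolicySketch {
        // "X size": files at or above this are left for the major compaction
        static final long MAX_MINOR_BYTES = 20L * 1024 * 1024;
        // "every X days" for the full merge
        static final long MAJOR_INTERVAL_MS = TimeUnit.DAYS.toMillis(2);

        record MapFile(String name, long sizeBytes) {}

        // Type 1: merge only the small, recently flushed files
        static List<MapFile> selectMinor(List<MapFile> files) {
            List<MapFile> picked = new ArrayList<>();
            for (MapFile f : files)
                if (f.sizeBytes() < MAX_MINOR_BYTES) picked.add(f);
            return picked.size() >= 2 ? picked : List.of();  // need 2+ to merge
        }

        // Type 2: merge everything every X days, or right after a split
        static boolean needsMajor(long lastMajorMs, boolean isSplitChild, long nowMs) {
            return isSplitChild || nowMs - lastMajorMs >= MAJOR_INTERVAL_MS;
        }

        public static void main(String[] args) {
            List<MapFile> files = List.of(
                    new MapFile("flush-1", 4L * 1024 * 1024),
                    new MapFile("flush-2", 5L * 1024 * 1024),
                    new MapFile("big-old", 64L * 1024 * 1024));
            System.out.println("minor picks: " + selectMinor(files));  // skips big-old
            System.out.println("major due: "
                    + needsMajor(0L, false, System.currentTimeMillis()));  // true: long overdue
        }
    }

With a split like this, the frequent minor compactions never touch the big gzipped file, so the hot regions stay cheap to compact, and the expensive recompression of the bulk of the region happens only on the infrequent major pass.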