Patch looks good, Billy. I haven't tested it because, after banging my head against hbase-826, I've learned that this notion of major compaction is a bit more involved than I first thought (I think you may have known all along how important the difference between minor and major is).
Here is what I learned. While compacting, if we overrun max versions or a cell has expired, we do not let the cell go through to the compacted file. That was fine in the old days, when we always compacted everything. Since we got smarter about compacting – i.e. minor compactions only compact the small files – this behavior can make for malignant results (see towards the end of hbase-826 for an illustration).
So, Billy, you need to pass the 'force' flag down into HStore#compact (we should probably rename 'force' to 'majorCompaction' or something?). Then, in HStore#compact, we only run the max versions and expiration code IF it's a major compaction. Otherwise, we just let ALL cells go through to the compacted files (at runtime, get and scan respect max versions and expiration times).
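To make that concrete, here is a rough sketch of the behavior I mean. It is not the real HStore code: the `CompactionSketch` class, the `Cell` type, and the parameter names are all made up for illustration, and it assumes cells arrive sorted by row and column with the newest timestamp first.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactionSketch {
    /** Hypothetical stand-in for a stored cell; not an HBase type. */
    static final class Cell {
        final String row;
        final String column;
        final long timestamp;

        Cell(String row, String column, long timestamp) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    /**
     * Assumes sortedCells is ordered by (row, column), newest timestamp first.
     * Minor compaction: every cell passes through untouched.
     * Major compaction: expired cells and versions beyond maxVersions are dropped.
     */
    static List<Cell> compact(List<Cell> sortedCells, boolean majorCompaction,
                              int maxVersions, long ttlMillis, long now) {
        List<Cell> out = new ArrayList<>();
        String currentKey = null;
        int versionsKept = 0;
        for (Cell cell : sortedCells) {
            if (!majorCompaction) {
                // Minor compaction only sees some of the files, so it must not
                // decide which versions are surplus; let everything through.
                out.add(cell);
                continue;
            }
            String key = cell.row + "/" + cell.column;
            if (!key.equals(currentKey)) {
                // New row/column pair: restart the version count.
                currentKey = key;
                versionsKept = 0;
            }
            boolean expired = now - cell.timestamp > ttlMillis;
            if (expired || versionsKept >= maxVersions) {
                continue; // dropped only because this is a major compaction
            }
            out.add(cell);
            versionsKept++;
        }
        return out;
    }
}
```

The point of the early `continue` is that a minor compaction never gets as far as the max versions and expiration checks at all.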
I'll be on IRC tomorrow if you want to chat more on this, Billy, or just write notes into this JIRA and we can go back and forth here (if you want, post a rough patch and I can give feedback – that might be best).
Oh, one other thing: there should be no maximum on the number of files to compact at a time when doing a major compaction. I think the way your patch is written, there isn't; it's only when minor compactions run that there is a limit – is that so?
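In other words, something along these lines. This is only a sketch of the selection rule as I understand it, not what the patch actually does: `FileSelectionSketch`, `selectFiles` and `maxFilesForMinor` are hypothetical names, and files are represented by nothing more than their sizes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FileSelectionSketch {
    /**
     * Returns the sizes of the files chosen for compaction, smallest first.
     * Major compaction: every file is selected, with no upper limit.
     * Minor compaction: at most maxFilesForMinor of the smallest files.
     */
    static List<Long> selectFiles(List<Long> fileSizes, boolean majorCompaction,
                                  int maxFilesForMinor) {
        List<Long> sorted = new ArrayList<>(fileSizes);
        Collections.sort(sorted);
        if (majorCompaction || sorted.size() <= maxFilesForMinor) {
            return sorted;
        }
        // The cap applies only on the minor path.
        return new ArrayList<>(sorted.subList(0, maxFilesForMinor));
    }
}
```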