The documentation for the selection algorithm is attached to HBASE-6371. The entire file is covered by this JIRA, except for most of 3.3.
Just do catch (Exception e) once?
Hmm... strong reason to do so? The code is kind of verbose this way, but it catches only what it intends to catch.
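To illustrate the trade-off, here is a minimal sketch, not the code from the patch. `PolicyLoader` and `DefaultPolicy` are hypothetical names, and the sketch assumes Java 7 or later so that multi-catch is available. It keeps the "catch only what we intend to catch" behavior while collapsing the separate handlers into one:

```java
import java.lang.reflect.InvocationTargetException;

/** Hypothetical loader used only to illustrate the exception handling. */
final class PolicyLoader {

    /** Hypothetical stand-in for a compaction policy class. */
    static class DefaultPolicy {
        public DefaultPolicy() {}
    }

    /**
     * Instantiates the named class reflectively. Only the expected
     * reflection failures are caught; an unrelated RuntimeException
     * (for example a NullPointerException caused by a bug) still propagates,
     * which a blanket catch (Exception e) would hide.
     */
    static Object create(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException | NoSuchMethodException
                | InstantiationException | IllegalAccessException
                | InvocationTargetException e) {
            throw new IllegalArgumentException("Unable to load policy " + className, e);
        }
    }
}
```

With this shape, the handler body is written once, but the list of caught types still documents exactly which failures are expected.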
What is the min flush time about?
This is used as the file time for the purposes of assigning files to tiers based on time.
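As a rough illustration of bucketing by time (the real rules are in the doc attached to HBASE-6371), the sketch below treats the min flush time as the file's timestamp and picks a tier by age. `TierAssignment`, `tierFor`, and the age boundaries are assumptions made up for this example, not names or values from the patch:

```java
/** Hypothetical helper: assigns a file to a tier based on its age. */
final class TierAssignment {

    /**
     * Returns the index of the first tier whose maximum age (in ms) is at
     * least the file's age, where the age is measured from the file's min
     * flush time. tierMaxAgesMs must be sorted ascending. Files older than
     * every bound land in a final catch-all tier, index tierMaxAgesMs.length.
     */
    static int tierFor(long minFlushTimeMs, long nowMs, long[] tierMaxAgesMs) {
        // Guard against clock skew producing a negative age.
        long ageMs = Math.max(0L, nowMs - minFlushTimeMs);
        for (int i = 0; i < tierMaxAgesMs.length; i++) {
            if (ageMs <= tierMaxAgesMs[i]) {
                return i;
            }
        }
        return tierMaxAgesMs.length;
    }
}
```

For example, with bounds of one hour and one day, a file flushed 30 minutes ago falls in tier 0, one flushed two hours ago in tier 1, and one flushed three days ago in the catch-all tier 2.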
Is 'TierCompaction' the default as it says in the class comment: '+ * Control knobs for default compaction algorithm.'?
No; changed comment.
Why the break out of config for TierCompaction in particular? Will we have to do this pattern for all we'd dynamically config: i.e. break out a Config class when we are already carrying a heavyweight Configuration anyways that is mostly accessible from anywhere?
Do you mean Configuration or CompactionConfiguration by large object?
CompactionConfiguration is base compaction config, it is not just xml-based, it uses runtime store-specific settings. TierBased one adds more on top of that; it seems that Tier-stuff doesn't belong to the main CompactionConfiguration; and main CompactionConfiguration is not as simple as generic Configuration.
It's also Store (e.g. region/cf) specific.
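A minimal sketch of the layering described here, under stated assumptions: the class names, fields, and `example.*` keys are hypothetical, and a plain `Map` stands in for the Hadoop `Configuration` to keep the example self-contained. The base class mixes XML-style settings with a runtime, store-specific value, and the tier class adds only tier knobs on top:

```java
import java.util.Map;

/** Hypothetical base config: generic settings plus per-store runtime values. */
class BaseCompactionConfig {
    final int minFiles;
    final long storeFlushSizeBytes; // supplied by the store at runtime, not read from XML

    BaseCompactionConfig(Map<String, String> conf, long storeFlushSizeBytes) {
        // "example.compaction.min" is a made-up key for illustration only.
        this.minFiles = Integer.parseInt(conf.getOrDefault("example.compaction.min", "3"));
        this.storeFlushSizeBytes = storeFlushSizeBytes;
    }
}

/** Hypothetical tier config: only tier-specific knobs live here. */
class TierCompactionConfig extends BaseCompactionConfig {
    final int numTiers;

    TierCompactionConfig(Map<String, String> conf, long storeFlushSizeBytes) {
        super(conf, storeFlushSizeBytes);
        // "example.compaction.tier.count" is likewise a made-up key.
        this.numTiers = Integer.parseInt(conf.getOrDefault("example.compaction.tier.count", "1"));
    }
}
```

Keeping the tier knobs in the subclass means code that only uses the default policy never sees them, while each store still gets its own instance.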
I wonder who is going to take the time to do configuration on each compaction tier? Is this asking a bit much of operators?
The doc in HBASE-6371 shows examples of separate configuration for tiers. The initial scenario for this may have been to compact "middle" files more aggressively than either old or recent files, so that would require tier tweaking.
The "old" compaction selection policy is on by default, so operators needn't worry.
There is no class comment on TierCompactionPolicy, the place I'd go to learn about how TierCompactionPolicy works.
That is an interesting question... there's an exhaustive doc on it, should it be copied to book.xml and referenced in javadoc, or summarized?
Does this policy do anything leveldb-like? Today every file may contain keys from the whole namespace; does this policy make it so that a tier is made of files that each contain a distinct subset of the namespace?
No, the similarity in names is misleading, they don't have a lot in common.