Details
Sub-task
Status: Resolved
Resolution: Duplicate
Description
Our current stats gathering is too simplistic: each client connection to a cluster only keeps a cache of the min and max key for a table. Instead, we should:
1. have a system table that stores the stats
2. create a coprocessor that updates the stats during compaction (i.e. using the preCompactSelection, postCompactSelection, preCompact, postCompact methods)
3. keep a kind of histogram: the row key found at every N bytes of data within a region. Perhaps we can do a delta update on minor compaction and a complete update on major compaction.
4. keep the min key/max key of a table in the stats table too
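
To make items 3 and 4 concrete, here is a minimal sketch of the bookkeeping in plain Java, with no HBase dependency. The class name `KeyBoundaryHistogram`, its methods, and the `bytesPerBucket` parameter are hypothetical and exist only for illustration. The sketch assumes cells arrive in sorted row-key order, as they would from a compaction scan, and that the caller supplies each cell's size in bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of HBase or Phoenix. */
final class KeyBoundaryHistogram {
    private final long bytesPerBucket;                  // the "N" from item 3
    private final List<byte[]> boundaries = new ArrayList<>();
    private long bytesInBucket = 0;
    private byte[] minKey;                              // first key seen
    private byte[] maxKey;                              // last key seen

    KeyBoundaryHistogram(long bytesPerBucket) {
        if (bytesPerBucket <= 0) {
            throw new IllegalArgumentException("bytesPerBucket must be positive");
        }
        this.bytesPerBucket = bytesPerBucket;
    }

    /**
     * Record one cell. Assumes calls arrive in ascending row-key order,
     * so the first key is the minimum and the latest key is the maximum.
     */
    void add(byte[] rowKey, long cellSizeBytes) {
        if (minKey == null) {
            minKey = rowKey.clone();
        }
        maxKey = rowKey.clone();
        bytesInBucket += cellSizeBytes;
        // Once at least N bytes have accumulated, mark this key as a boundary
        // and start counting the next bucket from zero.
        if (bytesInBucket >= bytesPerBucket) {
            boundaries.add(rowKey.clone());
            bytesInBucket = 0;
        }
    }

    List<byte[]> boundaries() { return boundaries; }
    byte[] minKey() { return minKey; }
    byte[] maxKey() { return maxKey; }
}
```

In a coprocessor, something like this could be fed from the scanner seen during compaction, and the resulting boundaries plus min/max key written to the stats system table once the compaction completes. A major compaction would rebuild the histogram for the whole region, while a minor compaction would only cover the files it touched, which is where the delta update idea comes in.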