Proposal from Hongbin about Cuboid White List:
- For more detail, please refer to the Kylin Dev mailing list: firstname.lastname@example.org
Logically, a cube contains cuboids representing all combinations of
dimensions. A naive cube building strategy that materializes all cuboids
quickly runs into the curse of dimensionality, because the number of
cuboids grows exponentially with the number of dimensions. Currently Kylin
uses a strategy called "aggregation groups" to reduce the number of
cuboids that need to be materialized.
However, if the query pattern is simple and fixed, the "aggregation group"
strategy is still not efficient enough. For example, suppose there are five
dimensions, namely A, B, C, D and E. The data modeler is sure that only the
combinations (A,B,C), (D,E) and (A,E) will be queried, so they use the
aggregation group tool to optimize the cube definition. However, whichever
aggregation groups they choose, many useless combinations would still be
materialized.
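
The waste in this example can be made concrete with a rough count. The sketch below is an illustration only: it assumes a simplified model in which an aggregation group materializes every non-empty combination of its own dimensions, which is not necessarily how Kylin's aggregation groups behave in detail. The helper names `subsets` and `materialized` are hypothetical.

```python
from itertools import combinations

def subsets(dims):
    """All non-empty combinations of the given dimensions, as frozensets."""
    return {
        frozenset(combo)
        for size in range(1, len(dims) + 1)
        for combo in combinations(dims, size)
    }

def materialized(groups):
    """Union of the combinations produced by each group (simplified model)."""
    result = set()
    for group in groups:
        result |= subsets(group)
    return result

# The three combinations the data modeler actually needs.
wanted = {frozenset("ABC"), frozenset("DE"), frozenset("AE")}

# Full cube over five dimensions versus one group per wanted combination.
full = subsets("ABCDE")
grouped = materialized(["ABC", "DE", "AE"])

print(len(full))              # combinations in the full cube
print(len(grouped))           # combinations under the grouped model
print(len(grouped - wanted))  # combinations that no query needs
```

Under this simplified model, even the tightest grouping still materializes several combinations that are never queried, which is the gap a whitelist is meant to close.
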
With a new strategy called "cuboid whitelist", data modelers can guide
Kylin to materialize only the cuboids they are interested in. Given the
whitelist, Kylin will materialize the minimal set of cuboids needed to cover
each cuboid in the whitelist. To support this, the following functionality
should be added:
1. A front-end/UI for specifying whitelist members and persisting them.
2. An enhanced job engine scheduler that calculates a minimal spanning
build tree based on the whitelist.
3. (OPTIONAL) An enhanced job engine that supports a dynamic whitelist and
triggers new builds for newly added whitelist members.
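
One possible shape for the build tree in item 2 is sketched below. This is a greedy heuristic under stated assumptions, not the proposed scheduler: it assumes the base cuboid (all dimensions) is always built, that a cuboid can be aggregated from any materialized superset, and that fewer dimensions in the parent means a cheaper build. The function name `build_tree` and the extra whitelist member (A) are hypothetical, added only to show nesting.

```python
def build_tree(dimensions, whitelist):
    """Map each whitelisted cuboid to the cuboid it would be built from.

    Greedy heuristic: each cuboid's parent is the smallest already placed
    cuboid that strictly contains it. The base cuboid is the root.
    """
    base = frozenset(dimensions)
    targets = {frozenset(c) for c in whitelist} - {base}
    for cuboid in targets:
        if not cuboid <= base:
            raise ValueError(f"unknown dimension in {sorted(cuboid)}")

    parent = {}
    placed = [base]
    # Place larger cuboids first so they can serve as parents of smaller ones.
    for cuboid in sorted(targets, key=lambda c: (-len(c), sorted(c))):
        candidates = [p for p in placed if cuboid < p]
        parent[cuboid] = min(candidates, key=lambda p: (len(p), sorted(p)))
        placed.append(cuboid)
    return parent

# Whitelist from the example above, plus a hypothetical (A) member.
tree = build_tree("ABCDE", ["ABC", "DE", "AE", "A"])
for child, source in tree.items():
    print("".join(sorted(child)), "<-", "".join(sorted(source)))
```

In this sketch (A,B,C), (D,E) and (A,E) are each built directly from the base cuboid, while (A) is built from (A,E) because that is its smallest materialized superset. A real scheduler would likely weigh row counts rather than dimension counts, and might add intermediate cuboids that are not in the whitelist when that lowers the total cost.
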
---------------- Imported from GitHub ----------------
Created by: lukehan
Created at: Thu Dec 25 13:17:11 CST 2014