Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Currently, cuboid sharding is based on the hash value of the row key. If we allow the hash value to be computed on a specific column C, then we are effectively partitioning the data by C. The benefit is that when a later query has a filter like "where C = 'xyz'", Kylin can skip the shards that cannot contain matching rows. Likewise, for a filter like "where C IN {many candidates}", Kylin can prune the candidates sent to each shard, so that a shard only receives the candidates that hash to it.
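To make the idea concrete, here is a minimal sketch of shard selection by column hash. It is not Kylin's implementation: the shard count, the use of CRC32 as the hash, and the function names `shard_of`, `shards_for_equals`, and `candidates_by_shard` are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for this sketch


def shard_of(value: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a value of column C to a shard id.

    CRC32 is an assumed stand-in for whatever stable hash the engine uses.
    """
    return zlib.crc32(value.encode("utf-8")) % num_shards


def shards_for_equals(value: str, num_shards: int = NUM_SHARDS) -> set:
    """For "where C = value", only one shard can hold matching rows."""
    return {shard_of(value, num_shards)}


def candidates_by_shard(candidates, num_shards: int = NUM_SHARDS) -> dict:
    """For "where C IN (...)", group candidates by the shard they hash to.

    Shards that receive no candidates do not appear in the result and can
    be skipped; each remaining shard only needs its own sub-list.
    """
    per_shard = defaultdict(list)
    for candidate in candidates:
        per_shard[shard_of(candidate, num_shards)].append(candidate)
    return dict(per_shard)


if __name__ == "__main__":
    print(shards_for_equals("xyz"))
    print(candidates_by_shard(["a", "b", "c", "xyz"]))
```

With row-key hashing, rows sharing the same value of C can land on any shard, so neither pruning step is possible; hashing on C alone is what makes the shard of a given value predictable at query time.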
Issue Links
- is related to: KYLIN-1428 Scalable dictionary to support cardinality of billions (Closed)