Currently, realtime OLAP does not support COUNT_DISTINCT (bitmap) for non-integer types.
The reason is that string values cannot be encoded to integers on the fly during ingestion, so I want to use RocksDB and HBase to implement a streaming distributed dictionary.
- each receiver will own a local dict cache
- all receivers will share a remote dict storage
- we choose to use RocksDB as the local dict cache
- we choose to use HBase as the remote dict storage
- for each cube, we will create a local dict and an HBase table
- for each column that occurs in COUNT_DISTINCT, we will create a column family in both RocksDB and HBase
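
The lookup flow implied by the list above can be sketched roughly as follows. This is only an illustration, not the proposed implementation: plain in-memory maps stand in for RocksDB and HBase, a lock stands in for an atomic put-if-absent on the remote side, and the names `RemoteDictStore`, `LocalDictCache`, `get_or_assign` and `encode` are all hypothetical. One `RemoteDictStore` instance is assumed to correspond to one cube.

```python
import threading
from collections import defaultdict


class RemoteDictStore:
    """In-memory stand-in for the shared remote dict storage (HBase in the proposal)."""

    def __init__(self):
        self._lock = threading.Lock()
        # column family -> {string value -> integer ID}
        self._families = defaultdict(dict)

    def get_or_assign(self, family, value):
        # The lock models an atomic "put if absent", so that every
        # receiver gets the same ID for the same string.
        with self._lock:
            ids = self._families[family]
            if value not in ids:
                ids[value] = len(ids) + 1  # IDs start at 1 within each column family
            return ids[value]


class LocalDictCache:
    """In-memory stand-in for the per-receiver dict cache (RocksDB in the proposal)."""

    def __init__(self, remote):
        self._remote = remote
        self._families = defaultdict(dict)
        self.remote_lookups = 0  # counts cache misses, for illustration only

    def encode(self, family, value):
        cached = self._families[family].get(value)
        if cached is not None:
            return cached
        # Cache miss: ask the shared store, then remember the answer locally.
        self.remote_lookups += 1
        assigned = self._remote.get_or_assign(family, value)
        self._families[family][value] = assigned
        return assigned


# Two receivers sharing one remote store (one cube).
remote = RemoteDictStore()
receiver_a = LocalDictCache(remote)
receiver_b = LocalDictCache(remote)

id_a = receiver_a.encode("user_id", "alice")
id_b = receiver_b.encode("user_id", "alice")    # same ID as on receiver_a
id_c = receiver_a.encode("user_id", "bob")
id_d = receiver_a.encode("device_id", "alice")  # separate column family, separate ID space
```

In a real deployment the remote assignment would need an atomic operation provided by the storage layer itself, and the local cache would be persisted rather than held in memory; both details are left open here.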