Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Description
ds_hll_sketch() and ds_hll_estimate() should be built-in functions that use the DataSketches functionality integrated by IMPALA-9631.
ds_hll_sketch() should receive a primitive expression and return a sketch.
ds_hll_estimate() should receive a sketch and return a primitive value: the cardinality estimate for the data that was fed into the sketch.
Usage:
select ds_hll_estimate(ds_hll_sketch(col_name)) from table_name;
Returns a cardinality estimate (similar to ndv()) for that column.
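To illustrate the split between building a sketch and reading an estimate from it, here is a toy HyperLogLog in Python. It is only a sketch of the general technique, not the DataSketches implementation, and its registers are not compatible with sketches produced by Impala or Hive. The names hll_sketch and hll_estimate, the register count, and the choice of blake2b as the hash are all assumptions made for this illustration.

```python
import hashlib
import math

P = 12        # index bits; an assumed value for this illustration
M = 1 << P    # number of registers (4096)


def hll_sketch(values):
    """Toy stand-in for ds_hll_sketch(): fold values into HLL registers."""
    registers = [0] * M
    for v in values:
        # 64-bit hash of the value's string form (blake2b is an assumption).
        digest = hashlib.blake2b(str(v).encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        idx = h >> (64 - P)                       # top P bits pick a register
        rest = h & ((1 << (64 - P)) - 1)          # remaining 52 bits
        rank = (64 - P) - rest.bit_length() + 1   # position of leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    return registers


def hll_estimate(registers):
    """Toy stand-in for ds_hll_estimate(): cardinality from the registers."""
    m = len(registers)
    alpha = 0.7213 / (1 + 1.079 / m)              # bias constant for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / zeros)
    return raw


if __name__ == "__main__":
    column = [i % 50000 for i in range(200000)]   # 50000 distinct values
    print(round(hll_estimate(hll_sketch(column))))
```

The sketch has a fixed size regardless of how many rows are aggregated, and duplicate values leave it unchanged, which is why the estimate can be computed in a separate step from the aggregation.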
Hive change that introduced the same: https://issues.apache.org/jira/browse/HIVE-22940