Details
-
Bug
-
Status: Triage Needed
-
Normal
-
Resolution: Unresolved
-
None
-
None
-
All
-
None
Description
The CQL native aggregate function avg returns zero when it's applied to an empty set of values:
> CREATE TABLE t (k int PRIMARY KEY, v int); > SELECT avg(v) FROM t; system.avg(v) --------------- 0
The new collection_avg that has just been added by CASSANDRA-18060 is based on the avg implementation, so both are consistent. Thus, it will also return zero for an empty collection:
> CREATE TABLE t (k int PRIMARY KEY, v frozen<set<int>>); > INSERT INTO t (k,v) VALUES(1, {}); > SELECT collection_avg(v) FROM t; system.collection_avg(v) -------------------------- 0
I think these functions should probably better return NaN instead of zero.
However, returning zero is not terribly incorrect, and returning NaN might be problematic for backward compatibility.
Also, to further complicate things, BigInteger and BigDecimal don't have a NaN value, so avg for varint and decimal should have a different behaviour, such as:
- Keep returning zero
- Return null
- Throw an exception
Attachments
Issue Links
- Discovered while testing
-
CASSANDRA-18060 Add aggregation scalar functions on collections
- Resolved