Uploaded image for project: 'Apache Cassandra'
  1. Apache Cassandra
  2. CASSANDRA-18073

AVG CQL function of an empty set of values returns zero

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Triage Needed
    • Normal
    • Resolution: Unresolved
    • None
    • CQL/Semantics
    • None
    • All
    • None

    Description

      The CQL native aggregate function avg returns zero when it's applied to an empty set of values:

      > CREATE TABLE t (k int PRIMARY KEY, v int);
      > SELECT avg(v) FROM t;
      
       system.avg(v)
      ---------------
                   0
      

      The new collection_avg that has just been added by CASSANDRA-18060 is based on the avg implementation, so both are consistent. Thus, it will also return zero for an empty collection:

      > CREATE TABLE t (k int PRIMARY KEY, v frozen<set<int>>);
      > INSERT INTO t (k,v) VALUES(1, {});
      > SELECT collection_avg(v) FROM t;
      
       system.collection_avg(v)
      --------------------------
                              0
      

      I think these functions should probably better return NaN instead of zero.

      However, returning zero is not terribly incorrect, and returning NaN might be problematic for backward compatibility.

      Also, to further complicate things, BigInteger and BigDecimal don't have a NaN value, so avg for varint and decimal should have a different behaviour, such as:

      • Keep returning zero
      • Return null
      • Throw an exception
         

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              adelapena Andres de la Peña
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: