CALCITE-4204: Intermittent precision in Druid results when using aggregation functions over columns of type DOUBLE

    Details

      Description

      Queries that apply aggregation functions to columns of type DOUBLE return results whose precision may differ from one execution to the next.

      Consider the following query, which can be found in DruidAdapterIT#testSingleAverageFunction.

      select "store_state", sum("store_cost") / count(*) as a
      from "foodmart" 
      group by "store_state" 
      order by a desc
      

      The same query, executed multiple times, returns different results for the OR row in the last digits of A:

      Result 1

      store_state=OR; A=2.6271402406293403
      store_state=CA; A=2.599338206292706
      store_state=WA; A=2.582870859286872
      

      Result 2

      store_state=OR; A=2.62714024062934
      store_state=CA; A=2.599338206292706
      store_state=WA; A=2.582870859286872
      

      Result 3

      store_state=OR; A=2.6271402406293394
      store_state=CA; A=2.599338206292706
      store_state=WA; A=2.582870859286872
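
The three values observed for OR differ only around the 15th or 16th significant digit. As a minimal sketch of how a test could tolerate that variation, the following compares doubles with a relative tolerance. The class `DoubleTolerance`, the method `closeTo`, and the tolerance `1e-12` are hypothetical and chosen for illustration; they are not part of Calcite or DruidAdapterIT, and this is not necessarily how the issue was resolved.

```java
/** Hypothetical helper for illustration; not part of Calcite. */
final class DoubleTolerance {
  private DoubleTolerance() {}

  /** Returns true if a and b differ by at most relTol times the larger magnitude. */
  static boolean closeTo(double a, double b, double relTol) {
    if (a == b) {
      return true; // covers exact matches, including two zeros
    }
    double scale = Math.max(Math.abs(a), Math.abs(b));
    return Math.abs(a - b) <= relTol * scale;
  }

  public static void main(String[] args) {
    // The three values reported for store_state=OR in this issue.
    double[] observed = {2.6271402406293403, 2.62714024062934, 2.6271402406293394};
    for (double v : observed) {
      // Each comparison against the first observation is within tolerance.
      System.out.println(closeTo(observed[0], v, 1e-12));
    }
  }
}
```

With this tolerance the three OR values compare as equal, while the OR and CA values remain clearly distinct.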
      

      Column "store_cost" in Druid is defined as shown below:

      "metricsSpec" : [
      ...
      {
        "name" : "store_sales",
        "type" : "doubleSum",
        "fieldName" : "store_sales"
      }
      ...]
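
The report does not identify a cause. One possibility, stated here as an assumption rather than a finding, is that partial sums of a doubleSum metric are combined in an order that varies between runs. Floating-point addition is not associative, so the same values added in a different order can differ in the last digits. The sketch below shows that effect in isolation; `SumOrderDemo` and its inputs are hypothetical and unrelated to the foodmart data.

```java
/** Hypothetical demo of order-dependent double summation; not Calcite or Druid code. */
final class SumOrderDemo {
  private SumOrderDemo() {}

  /** Adds the values left to right, as a naive accumulator would. */
  static double sum(double[] values) {
    double total = 0.0;
    for (double v : values) {
      total += v;
    }
    return total;
  }

  public static void main(String[] args) {
    double forward = sum(new double[] {0.1, 0.2, 0.3});
    double backward = sum(new double[] {0.3, 0.2, 0.1});
    System.out.println(forward);              // 0.6000000000000001
    System.out.println(backward);             // 0.6
    System.out.println(forward == backward);  // false
  }
}
```

The two totals differ by roughly one unit in the last place, which is the same order of magnitude as the variation seen in the OR row.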
      

       


              People

              • Assignee: Julian Hyde (julianhyde)
              • Reporter: Stamatis Zampetakis (zabetak)
              • Votes: 0
              • Watchers: 3
