CALCITE-4204

Intermittent precision in Druid results when using aggregation functions over columns of type DOUBLE

Details

    Description

      Queries that apply aggregation functions to columns of type DOUBLE return results whose precision may differ from one execution to the next.

      Consider the following query, which can be found in DruidAdapterIT#testSingleAverageFunction.

      select "store_state", sum("store_cost") / count(*) as a
      from "foodmart" 
      group by "store_state" 
      order by a desc
      

      Executing the same query multiple times returns different results:

      Result 1

      store_state=OR; A=2.6271402406293403
      store_state=CA; A=2.599338206292706
      store_state=WA; A=2.582870859286872
      

      Result 2

      store_state=OR; A=2.62714024062934
      store_state=CA; A=2.599338206292706
      store_state=WA; A=2.582870859286872
      

      Result 3

      store_state=OR; A=2.6271402406293394
      store_state=CA; A=2.599338206292706
      store_state=WA; A=2.582870859286872
      
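      The three values reported for store_state=OR agree in every digit up to the last few. As a minimal sketch of how such results could be compared without depending on those trailing digits, the snippet below checks the reported values against a tolerance. The class name, the helper approxEquals, and the tolerance 1e-12 are hypothetical choices made for illustration; they are not part of Calcite or of DruidAdapterIT.

```java
public class ToleranceCompare {

    /** Returns true if a and b differ by no more than epsilon. */
    public static boolean approxEquals(double a, double b, double epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    public static void main(String[] args) {
        // The three values reported for store_state=OR in Results 1 to 3.
        double[] observed = {
            2.6271402406293403,
            2.62714024062934,
            2.6271402406293394
        };
        // Assumed tolerance, chosen only for this illustration.
        double epsilon = 1e-12;

        for (double value : observed) {
            // Compare each run against the first run.
            System.out.println(value + " ~ " + observed[0] + ": "
                + approxEquals(observed[0], value, epsilon));
        }
    }
}
```

      With this tolerance the three OR values compare as equal, while the values for different states (for example OR and CA) do not.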

      Column "store_cost" in Druid is defined as shown below:

      "metricsSpec" : [
      ...
      {
        "name" : "store_cost",
        "type" : "doubleSum",
        "fieldName" : "store_cost"
      }
      ...]
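      This report does not identify the cause. One possible explanation, stated here only as an assumption, is that the partial sums behind a doubleSum metric are combined in a different order from one execution to the next. Floating-point addition is not associative, so the order of summation alone can change the last digits of a result. The sketch below shows that general effect with plain Java doubles; the class SummationOrder is hypothetical and does not call Druid or Calcite.

```java
public class SummationOrder {

    /** Sums the values left to right with a simple accumulator. */
    public static double sum(double... values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Same three inputs, two different orders.
        double forward = sum(0.1, 0.2, 0.3);
        double backward = sum(0.3, 0.2, 0.1);

        System.out.println(forward);   // prints 0.6000000000000001
        System.out.println(backward);  // prints 0.6
        System.out.println(forward == backward);  // prints false
    }
}
```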
      

       


            People

              julianhyde Julian Hyde
              zabetak Stamatis Zampetakis