Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:
      None

      Description

      Currently, the result type of AVG() is set to be the same as the input type. Instead, it should be related to the rule for divide of DECIMAL result type, since AVG() is essentially a divide of DECIMAL sum by the integral count. This will require probably some additional plumbing through the UDA interface in addition to the frontend work.

        Issue Links

          Activity

          Hide
          kwho Michael Ho added a comment -

          https://github.com/apache/incubator-impala/commit/637cc3e447650a5c9bc4b8dd79ee74fd11459fa2

          IMPALA-4821: Update AVG() for DECIMAL_V2
          This change implements the DECIMAL_V2's behavior for AVG().
          The differences with DECIMAL_V1 are:

          1. The output type has a minimum scale of 6. This is similar
          to MS SQL's behavior which takes the max of 6 and the input
          type's scale. We deviate from MS SQL in the output's precision
          which is always set to 38. We use the smallest precision which
          can store the output. A key insight is that the output of AVG()
          is no wider than the inputs. Precision only needs to be adjusted
          when the scale is augmented. Using a smaller precision avoids
          potential loss of precision in subsequent decimal operations
          (e.g. division) if AVG() is a subexpression. Please note that
          the output type is different from SUM()/COUNT() as the latter
          can have a much larger scale.

          2. Due to a minimum of 6 decimal places for the output,
          AVG() for decimal values whose whole number part exceeds 32
          decimal places (e.g. DECIMAL(38,4), DECIMAL(33,0)) will
          always overflow as the scale is augmented to 6. Certain
          decimal types which work with AVG() in DECIMAL_V1 no longer
          work in DECIMAL_V2.

          Change-Id: I28f5ef0370938440eb5b1c6d29b2f24e6f88499f
          Reviewed-on: http://gerrit.cloudera.org:8080/6038
          Reviewed-by: Dan Hecht <dhecht@cloudera.com>
          Reviewed-by: Alex Behm <alex.behm@cloudera.com>
          Tested-by: Impala Public Jenkins

          Show
          kwho Michael Ho added a comment - https://github.com/apache/incubator-impala/commit/637cc3e447650a5c9bc4b8dd79ee74fd11459fa2 IMPALA-4821 : Update AVG() for DECIMAL_V2 This change implements the DECIMAL_V2's behavior for AVG(). The differences with DECIMAL_V1 are: 1. The output type has a minimum scale of 6. This is similar to MS SQL's behavior which takes the max of 6 and the input type's scale. We deviate from MS SQL in the output's precision which is always set to 38. We use the smallest precision which can store the output. A key insight is that the output of AVG() is no wider than the inputs. Precision only needs to be adjusted when the scale is augmented. Using a smaller precision avoids potential loss of precision in subsequent decimal operations (e.g. division) if AVG() is a subexpression. Please note that the output type is different from SUM()/COUNT() as the latter can have a much larger scale. 2. Due to a minimum of 6 decimal places for the output, AVG() for decimal values whose whole number part exceeds 32 decimal places (e.g. DECIMAL(38,4), DECIMAL(33,0)) will always overflow as the scale is augmented to 6. Certain decimal types which work with AVG() in DECIMAL_V1 no longer work in DECIMAL_V2. Change-Id: I28f5ef0370938440eb5b1c6d29b2f24e6f88499f Reviewed-on: http://gerrit.cloudera.org:8080/6038 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Impala Public Jenkins

            People

            • Assignee:
              kwho Michael Ho
              Reporter:
              dhecht Dan Hecht
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development