Hive
  1. Hive
  2. HIVE-6508

Mismatched results between vector and non-vector mode with decimal field

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0, 0.14.0
    • Component/s: Query Processor
    • Labels:
      None

      Description

      Following query has a little mismatch in result as compared to the non-vector mode.

      select d_year, i_brand_id, i_brand,
             sum(ss_ext_sales_price) as sum_agg
      from date_dim
      join store_sales on date_dim.d_date_sk = store_sales.ss_sold_date_sk
      join item on store_sales.ss_item_sk = item.i_item_sk
      where i_manufact_id = 128
        and d_moy = 11
      group by d_year, i_brand, i_brand_id
      order by d_year, sum_agg desc, i_brand_id
      limit 100;
      

      This query is on tpcds data.
      The field ss_ext_sales_price is of type decimal(7,2) and everything else is an integer.

      1. HIVE-6508.1.patch
        41 kB
        Remus Rusanu
      2. HIVE-6508.1.patch
        41 kB
        Jitendra Nath Pandey

        Activity

        Hide
        Remus Rusanu added a comment -

        The value 0 comes in the input vector unscaled (scale 0). As aggregates (SUM, STDxx) are being updated, they us the scale of the input value, not the scale of the input column. So any 0 in the input will round the intermediate fractional part of the intermediate. Final result is off. AVG uses a special scale so is not affected. MIN/MAX use the input value scale, but has no side effects. Fix is to pass in the column scale explictly, rather than assume the input value scale has the column scale. Ultimately the behavior of passing in unscaled 0s is wrong, but this comes from the row-mode join modus-operandi and I don't want to change that. Hardening the aggregates against this case is more robust.

        Show
        Remus Rusanu added a comment - The value 0 comes in the input vector unscaled (scale 0). As aggregates (SUM, STDxx) are being updated, they us the scale of the input value, not the scale of the input column. So any 0 in the input will round the intermediate fractional part of the intermediate. Final result is off. AVG uses a special scale so is not affected. MIN/MAX use the input value scale, but has no side effects. Fix is to pass in the column scale explictly, rather than assume the input value scale has the column scale. Ultimately the behavior of passing in unscaled 0s is wrong, but this comes from the row-mode join modus-operandi and I don't want to change that. Hardening the aggregates against this case is more robust.
        Hide
        Jitendra Nath Pandey added a comment -

        +1. The patch looks good to me.

        Show
        Jitendra Nath Pandey added a comment - +1. The patch looks good to me.
        Hide
        Jitendra Nath Pandey added a comment -

        Same patch re-uploaded to trigger pre-commit build.

        Show
        Jitendra Nath Pandey added a comment - Same patch re-uploaded to trigger pre-commit build.
        Hide
        Sergey Shelukhin added a comment -

        It might make sense to add a test to ./ql/src/test/queries/clientcompare

        Show
        Sergey Shelukhin added a comment - It might make sense to add a test to ./ql/src/test/queries/clientcompare
        Hide
        Remus Rusanu added a comment -

        Sergey Shelukhin There is a new test case testSumDecimalHive6508 the covers possible regression

        Show
        Remus Rusanu added a comment - Sergey Shelukhin There is a new test case testSumDecimalHive6508 the covers possible regression
        Hide
        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12633294/HIVE-6508.1.patch

        ERROR: -1 due to 1 failed/errored test(s), 5374 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
        

        Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1660/testReport
        Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1660/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 1 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12633294

        Show
        Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12633294/HIVE-6508.1.patch ERROR: -1 due to 1 failed/errored test(s), 5374 tests executed Failed tests: org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1660/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1660/console Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed This message is automatically generated. ATTACHMENT ID: 12633294
        Hide
        Remus Rusanu added a comment -

        The failure is unrelated to the patch

        Show
        Remus Rusanu added a comment - The failure is unrelated to the patch
        Hide
        Jitendra Nath Pandey added a comment -

        Committed to trunk. It is a correctness bug, therefore I will port it to hive-13 branch as well.

        Show
        Jitendra Nath Pandey added a comment - Committed to trunk. It is a correctness bug, therefore I will port it to hive-13 branch as well.
        Hide
        Jitendra Nath Pandey added a comment -

        Committed to branch-0.13

        Show
        Jitendra Nath Pandey added a comment - Committed to branch-0.13
        Hide
        Thejas M Nair added a comment -

        This has been fixed in 0.14 release. Please open new jira if you see any issues.

        Show
        Thejas M Nair added a comment - This has been fixed in 0.14 release. Please open new jira if you see any issues.

          People

          • Assignee:
            Remus Rusanu
            Reporter:
            Remus Rusanu
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development