Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4841

Impala and Hive significant difference in result while calculating correlation coefficient with the help of SD and AVG

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Information Provided
    • Impala 2.0, Impala 2.7.0
    • None
    • Frontend

    Description

      Hadoop Setup :-
      8 node Cluster with CDH 5.9
      Hive Version :- Hive 1.1.0-cdh5.9.0
      Compiled by jenkins on Fri Oct 21 00:54:46 PDT 2016
      From source with checksum 9c5d0bee25fab27d28098c3080f8aedc

      Impala Version :-
      Impala v2.7.0-cdh5.9.0 (4b4cf19) built on Fri Oct 21 01:07:22 PDT 2016

      Issue:-
      I ran same query on Hive and Impala (through Hue, Screenshot attached)

      SELECT (AVG(cost_of_liquidity_provision*risk_of_liquidity_provision)- AVG(cost_of_liquidity_provision)*AVG(risk_of_liquidity_provision))/ (1.00000000*STDDEV_POP(cost_of_liquidity_provision)*STDDEV_POP(risk_of_liquidity_provision)) AS corr_coeff
      FROM liquidity
      

      The table is in parquet format (non partitioned)

      I got different output in two different runs
      Hive :- 0.8465 (correct as verified by external application e.g. R)
      Impala :- 0.0636

      Similar thing happened with another query too.

      Attachments

        1. Impala and Hive.PNG
          101 kB
          Indranil Gayen
        2. IMPALA-4841 - 1.PNG
          65 kB
          Indranil Gayen
        3. IMPALA-4841 - 2.PNG
          171 kB
          Indranil Gayen
        4. IMPALA-4841 - 3.PNG
          205 kB
          Indranil Gayen
        5. Final_Store.sql
          4 kB
          Indranil Gayen
        6. Final_Store_Hive_Parquet.sql
          5 kB
          Indranil Gayen
        7. Impala_Hive_Query_Results.PNG
          205 kB
          Indranil Gayen
        8. Impala_Hive_Same_Table_From Hive.PNG
          171 kB
          Indranil Gayen
        9. Impala_Hive_Table_Structure.PNG
          65 kB
          Indranil Gayen

        Activity

          People

            grahn Greg Rahn
            bedantaguru_impala_7369 Indranil Gayen
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: