Hive / HIVE-3455

ANSI CORR(X,Y) is incorrect


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.12.0
    • Fix Version/s: 0.13.0
    • Component/s: UDF
    • Labels:
    • Release Note: the patch for the src/ql/src/java/org/apache/hadoop/hive/ql/udf/generic
    • Tags:
      correlation UDAF

      Description

      A simple test with two collinear vectors returns a wrong result: the correlation of collinear vectors should be exactly 1 or -1.
      The problem is in the merge of the variances, in this file:

      http://svn.apache.org/viewvc/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCorrelation.java?revision=1157222&view=markup

      lines:
      347: myagg.xvar += xvarB + (xavgA - xavgB) * (xavgA - xavgB) * myagg.count;
      348: myagg.yvar += yvarB + (yavgA - yavgB) * (yavgA - yavgB) * myagg.count;

      The correct merge, where nA and nB are the row counts of the two partial aggregates being merged, should be:
      347: myagg.xvar += xvarB + (xavgA - xavgB) * (xavgA - xavgB) / myagg.count * nA * nB;
      348: myagg.yvar += yvarB + (yavgA - yavgB) * (yavgA - yavgB) / myagg.count * nA * nB;
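
The corrected merge can be checked outside Hive with a small standalone sketch. This is not the code from GenericUDAFCorrelation.java or from any of the attached patches: the class `CorrMergeSketch`, its `Agg` holder, and the helpers `of`, `merge`, and `corr` are hypothetical names introduced here for illustration. It assumes that `xvar` and `yvar` hold sums of squared deviations from the mean (not yet divided by the count), that `count` in the merged buffer is already nA + nB when the variance terms are combined, and that the covariance term merges with the same nA * nB / count weighting.

```java
// Hypothetical standalone sketch; not the Hive implementation.
final class CorrMergeSketch {

    /** Partial aggregate: count, means, and sums of squared / cross deviations. */
    static final class Agg {
        long count;
        double xavg, yavg;   // running means
        double xvar, yvar;   // sums of squared deviations from the mean
        double covar;        // sum of cross deviations
    }

    /** Builds a partial aggregate from raw pairs with a one-pass update. */
    static Agg of(double[] xs, double[] ys) {
        Agg a = new Agg();
        for (int i = 0; i < xs.length; i++) {
            a.count++;
            double dx = xs[i] - a.xavg;   // deviation from the old x mean
            double dy = ys[i] - a.yavg;   // deviation from the old y mean
            a.xavg += dx / a.count;
            a.yavg += dy / a.count;
            a.xvar += dx * (xs[i] - a.xavg);
            a.yvar += dy * (ys[i] - a.yavg);
            a.covar += dx * (ys[i] - a.yavg);
        }
        return a;
    }

    /** Merges two partial aggregates using the weighting proposed in the issue. */
    static Agg merge(Agg a, Agg b) {
        if (a.count == 0) return b;
        if (b.count == 0) return a;
        long nA = a.count;
        long nB = b.count;
        Agg m = new Agg();
        m.count = nA + nB;               // assumed: combined count, as in the proposed fix
        double dx = a.xavg - b.xavg;
        double dy = a.yavg - b.yavg;
        m.xavg = (a.xavg * nA + b.xavg * nB) / m.count;
        m.yavg = (a.yavg * nA + b.yavg * nB) / m.count;
        // Corrected variance merge: delta^2 / count * nA * nB
        m.xvar = a.xvar + b.xvar + dx * dx / m.count * nA * nB;
        m.yvar = a.yvar + b.yvar + dy * dy / m.count * nA * nB;
        // Assumption: covariance merges with the same weighting
        m.covar = a.covar + b.covar + dx * dy / m.count * nA * nB;
        return m;
    }

    /** Pearson correlation from the aggregate. */
    static double corr(Agg a) {
        return a.covar / Math.sqrt(a.xvar * a.yvar);
    }
}
```

For example, splitting x = 1..6 with y = 2x + 1 into a partition of two rows and a partition of four rows, then calling `merge` on the two `of` results, gives the same `xvar` and `yvar` as aggregating all six rows at once, and `corr` returns 1 up to floating-point rounding. Replacing the weighting with `* m.count`, as in the original lines 347 and 348, inflates both variances and pulls the correlation away from 1.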

        Attachments

        1. HIVE-3455.1.patch.txt
          6 kB
          Navis Ryu
        2. HIVE3455.corrTest.tar.gz
          3 kB
          Jon Hartlaub
        3. my.patch
          2 kB
          Maxim Bolotin


            People

            • Assignee: Maxim Bolotin (maximbo)
            • Reporter: Maxim Bolotin (maximbo)
