Kylin / KYLIN-3361

Add a two-layer UDAF stddev_sum

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: v3.1.0
    • None
    • Sprint: Sprint 52

    Description

      $(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2 = x_1^2 + x_2^2 + \dots + x_n^2 - n\bar{x}^2$, where $\bar{x}$ is the average of $x_1, x_2, \dots, x_n$. Therefore, to compute stddev, Kylin only needs to pre-calculate $\mathrm{sum}(x_i^2)$, $\mathrm{sum}(x_i)$ and count.
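The identity above can be checked with a minimal Python sketch. It assumes the population form of the standard deviation (dividing by n, which matches var(X) = E(X^2) - E(X)^2), and the function name `stddev_from_aggregates` is hypothetical, not part of Kylin.

```python
import math


def stddev_from_aggregates(sum_sq: float, total: float, count: int) -> float:
    """Population stddev from sum(x_i^2), sum(x_i) and count (hypothetical helper)."""
    if count <= 0:
        raise ValueError("count must be positive")
    mean = total / count
    # sum((x_i - mean)^2) == sum(x_i^2) - n * mean^2
    squared_deviations = sum_sq - count * mean * mean
    # Clamp tiny negative values that floating-point rounding can produce.
    return math.sqrt(max(squared_deviations, 0.0) / count)


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
result = stddev_from_aggregates(
    sum(v * v for v in values),  # sum(x_i^2)
    sum(values),                 # sum(x_i)
    len(values),                 # count
)
```

Only the three aggregates are needed at query time, so the raw values never have to be rescanned.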

       

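      Here S(·) denotes the sum over a set of values and n its count; X is split into two partitions X_1 and X_2 with counts n_1 and n_2, so that n = n_1 + n_2.

      \[
      \begin{aligned}
      \mathrm{var}(X) &= E(X^2) - E(X)^2 \\
      \mathrm{var}'(X) &= n \cdot \mathrm{var}(X) \\
      &= n \left( E(X^2) - E(X)^2 \right) \\
      &= S(X^2) - \frac{S(X)^2}{n} \\
      &= S(X_1^2) + S(X_2^2) - \frac{\left( S(X_1) + S(X_2) \right)^2}{n_1 + n_2} \\
      &= S(X_1^2) - \frac{S(X_1)^2}{n_1} + S(X_2^2) - \frac{S(X_2)^2}{n_2} + \frac{S(X_1)^2}{n_1} + \frac{S(X_2)^2}{n_2} - \frac{\left( S(X_1) + S(X_2) \right)^2}{n_1 + n_2} \\
      &= \mathrm{var}'(X_1) + \mathrm{var}'(X_2) + \frac{\left( S(X_2)\, n_1 - S(X_1)\, n_2 \right)^2}{n_1 n_2 (n_1 + n_2)} \\
      &= \mathrm{var}'(X_1) + \mathrm{var}'(X_2) + \left( S(X_2) - S(X_1) \frac{n_2}{n_1} \right)^2 \frac{n_1}{n_2 (n_1 + n_2)}
      \end{aligned}
      \]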
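The merge step at the end of this derivation can be sketched in Python as follows. This is an illustration under stated assumptions, not Kylin's actual implementation: the `Partial` type and the `from_values`, `merge` and `stddev` helpers are hypothetical names, and the population form of the variance is assumed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Per-partition state (hypothetical): count, S(X) and var'(X) = n * var(X)."""
    n: int
    s: float
    m2: float


def from_values(values) -> Partial:
    n = len(values)
    if n == 0:
        return Partial(0, 0.0, 0.0)
    s = sum(values)
    # var'(X) = S(X^2) - S(X)^2 / n
    return Partial(n, s, sum(v * v for v in values) - s * s / n)


def merge(a: Partial, b: Partial) -> Partial:
    # An empty partition contributes nothing and would divide by zero below.
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    # Cross term: (S(X2) n1 - S(X1) n2)^2 / (n1 n2 (n1 + n2))
    cross = (b.s * a.n - a.s * b.n) ** 2 / (a.n * b.n * (a.n + b.n))
    return Partial(a.n + b.n, a.s + b.s, a.m2 + b.m2 + cross)


def stddev(p: Partial) -> float:
    # Population standard deviation recovered from var'(X) and the count.
    return math.sqrt(max(p.m2, 0.0) / p.n)


left = from_values([2.0, 4.0, 4.0, 4.0])
right = from_values([5.0, 5.0, 7.0, 9.0])
merged = merge(left, right)
```

Because `merge` needs only the three fields of each side, partial results can be combined in any order, which is what makes a two-layer aggregation possible.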

          People

            Assignee: Zhong Yanghong (yaho)
            Reporter: Zhong Yanghong (yaho)
            Votes: 1
            Watchers: 5
