Hive / HIVE-165

Add standard statistical functions

    Details

    • Type: Wish
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

      Description

      The last step in the unholy triumvirate of statistical built-ins is the variance. We already have n (count) and the mean (avg). I currently have a job or two that filters all of the data into a single reducer, which just computes mean/n/variance and writes the result to a table, so my guess is that a built-in variance would be a pretty big speed increase. Not a huge deal though, as computing the variance myself is trivial.

      (Average, variance, and n can be co-computed in one pass, so if you're doing var() you can basically have avg() and count() for free.)
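To illustrate the one-pass claim, here is a minimal Python sketch using Welford's online update, with a merge step so partial results from several workers can be combined instead of sending every row to one reducer. This is an assumption about one reasonable way to do it, not a description of how Hive implements anything; the class name `RunningStats` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical accumulator: count, mean, and variance in a single pass."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def add(self, x: float) -> None:
        # Welford's update: adjust the mean, then accumulate squared deviation.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial results (e.g. from two mappers) without
        # revisiting the underlying rows.
        if other.n == 0:
            return RunningStats(self.n, self.mean, self.m2)
        if self.n == 0:
            return RunningStats(other.n, other.mean, other.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    def variance(self) -> float:
        # Population variance; returning 0.0 for empty input is a choice
        # made for this sketch only.
        return self.m2 / self.n if self.n else 0.0


if __name__ == "__main__":
    left, right = RunningStats(), RunningStats()
    for x in [2, 4, 4, 4]:
        left.add(x)
    for x in [5, 5, 7, 9]:
        right.add(x)
    total = left.merge(right)
    print(total.n, total.mean, total.variance())
```

Once the accumulator holds `n`, `mean`, and `m2`, count and average come out of the same state as the variance, which is the "for free" part of the note above.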

            People

            • Assignee: David Phillips
            • Reporter: Adam Kramer
            • Votes: 0
            • Watchers: 2
