Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4
    • Fix Version/s: 0.5
    • Component/s: None
    • Labels:
      None

      Description

      I am occasionally seeing a need to do exponential averaging of values or rates.

      The HBase folks want this as well.

      So it is time to do it. I have a patch that does the averaging of values according to
      http://tdunning.blogspot.com/2011/03/exponential-weighted-averages-with.html

      I will attach that as a patch now and do the rate averaging as well before committing.
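For readers without the linked post at hand, here is a minimal sketch of an exponentially weighted mean over irregularly sampled values. It is not the attached patch: the class name `ExpAverage`, the field names, and the assumption that timestamps arrive in non-decreasing order are all illustrative. `alpha` is the time constant and must be positive.

```python
import math


class ExpAverage:
    """Exponentially weighted mean for irregularly sampled values (sketch)."""

    def __init__(self, alpha):
        if alpha <= 0:
            raise ValueError("time constant alpha must be positive")
        self.alpha = alpha
        self.s = 0.0        # decayed sum of values
        self.w = 0.0        # decayed sum of weights
        self.last_t = None  # time of the most recent sample

    def add(self, t, x):
        # Decay factor for the time elapsed since the previous sample.
        # Assumes t is not earlier than the previous sample's time.
        if self.last_t is None:
            pi = 1.0  # nothing to decay yet: s and w are still zero
        else:
            pi = math.exp(-(t - self.last_t) / self.alpha)
        self.s = x + pi * self.s
        self.w = 1.0 + pi * self.w
        self.last_t = t

    def mean(self):
        if self.w == 0.0:
            raise ValueError("no samples added yet")
        return self.s / self.w
```

Each new sample enters with weight 1 and everything older is scaled by `exp(-dt / alpha)`, so a sample that is `alpha` time units old counts for about 37% of a fresh one.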

        Activity

        Dmitriy Lyubimov added a comment - - edited

        Unfortunately, the LaTeX server seems to be down at the moment, so formulas are not rendering. I don't know whether that is an intermittent or a permanent condition.

        OK, fixed now. I don't know for how long, though; it looks like Google Groups changes the access token every so often, perhaps to fight attacks.

        Ted Dunning added a comment -

        Should be pretty much unconditionally stable for positive time constants. I should have mentioned this, but negative time constants don't make sense, so I spaced the warning.

        What kind of scaling factor do you mean?

        Dmitriy Lyubimov added a comment - - edited

        Ted,

        I am also using this, with slight modifications to make it usable with map-reduce. Two suggestions I implemented on the side: updates to the past (input unordered w.r.t. sampling time, albeit potentially less numerically stable) and combining, for use with MR: http://weatheringthrutechdays.blogspot.com/2011/04/follow-up-for-mean-summarizer-post.html. No algorithm in Mahout currently uses MR for summarizing inputs, but one might. These improvements made it possible to implement Pig functions that run those formulas.
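To make the two modifications concrete, here is a hedged sketch of how out-of-order updates and combining can work when a summary is kept as a tuple `(s, w, t)`: decayed sum, decayed weight, and the time the summary refers to. The function names and the tuple layout are assumptions for illustration, not code from the linked post or from Mahout.

```python
import math

# Empty summary: no mass, referring to "before any sample".
EMPTY = (0.0, 0.0, float("-inf"))


def decay_to(state, t, alpha):
    """Re-express a summary (s, w, t0) as of a later time t >= t0."""
    s, w, t0 = state
    if t == t0:
        return (s, w, t)
    pi = math.exp(-(t - t0) / alpha)
    return (pi * s, pi * w, t)


def add_sample(state, t, x, alpha):
    """Add sample x taken at time t; t may precede the summary's time."""
    s, w, t0 = state
    if t >= t0:
        s, w, _ = decay_to(state, t, alpha)
        return (s + x, w + 1.0, t)
    # Late arrival: down-weight the sample instead of the summary.
    pi = math.exp(-(t0 - t) / alpha)
    return (s + pi * x, w + pi, t0)


def merge(a, b, alpha):
    """Combine two partial summaries, e.g. from different mappers."""
    t = max(a[2], b[2])
    sa, wa, _ = decay_to(a, t, alpha)
    sb, wb, _ = decay_to(b, t, alpha)
    return (sa + sb, wa + wb, t)


def mean(state):
    return state[0] / state[1]
```

Because every sample's weight depends only on its distance from the latest timestamp, the result is the same (up to rounding) whether samples arrive in order, out of order, or in separately summarized partitions that are merged afterwards.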

        Also, I experimented with yet another biased estimator for binomial sums (similar to using a beta distribution as a conjugate prior for the binomial distribution) that converges on a predefined value P_0 (similar to the beta distribution mode converging to 0.5 as n goes to 0) in two circumstances: 1) there is a lack of history (as in the beta-distribution-based estimate); 2) there is a lack of recent history.
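The comment does not give the estimator's formula, so the following is only one possible reading, stated as an assumption: keep exponentially decayed success and trial counts and add a fixed pseudo-count `prior_weight` centered on `p0`. With no trials, or with only very old trials, the decayed counts vanish and the estimate falls back to `p0`. The class and parameter names are hypothetical.

```python
import math


class DecayedRateWithPrior:
    """Decayed success ratio shrunk toward a prior value p0 (sketch)."""

    def __init__(self, alpha, p0, prior_weight=1.0):
        if alpha <= 0 or prior_weight <= 0:
            raise ValueError("alpha and prior_weight must be positive")
        self.alpha = alpha
        self.p0 = p0
        self.prior_weight = prior_weight
        self.s = 0.0        # decayed count of successes
        self.w = 0.0        # decayed count of trials
        self.last_t = None

    def _factor(self, t):
        # Decay from the last update to time t (t must not be earlier).
        if self.last_t is None:
            return 1.0
        return math.exp(-(t - self.last_t) / self.alpha)

    def add(self, t, success):
        pi = self._factor(t)
        self.s = pi * self.s + (1.0 if success else 0.0)
        self.w = pi * self.w + 1.0
        self.last_t = t

    def estimate(self, t):
        # Evaluate at query time t without modifying the state.
        pi = self._factor(t)
        num = pi * self.s + self.prior_weight * self.p0
        den = pi * self.w + self.prior_weight
        return num / den
```

A larger `prior_weight` makes the estimate return to `p0` sooner as the evidence ages, which is the knob to tune for the "lack of recent history" case.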

        There's probably no immediate use for either in Mahout but both problems seem to be pretty common otherwise.

        Lance Norskog added a comment - - edited

        Is this numerically stable? Or rather, in which range is this numerically stable?
        Could there be a scaling factor?

        Ted Dunning added a comment -

        Committed. Didn't wait long for reviews because this is pretty trivial stuff. We can reopen or open a new issue if somebody has a problem.

        Ted Dunning added a comment -

        This implements time averaging and rate averaging with test coverage for both.
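As a rough illustration of the rate-averaging half, the sketch below decays the accumulated amount and the accumulated elapsed time by the same factor and reports their ratio. This is an assumption about the approach, not the committed code; the class name `ExpRate` and the explicit start time `t0` are illustrative.

```python
import math


class ExpRate:
    """Exponentially weighted rate (amount per unit time), as a sketch."""

    def __init__(self, alpha, t0=0.0):
        if alpha <= 0:
            raise ValueError("time constant alpha must be positive")
        self.alpha = alpha
        self.s = 0.0      # decayed sum of observed amounts
        self.T = 0.0      # decayed sum of elapsed time
        self.last_t = t0  # observation starts at t0

    def add(self, t, x):
        # Assumes t is not earlier than the previous sample's time.
        dt = t - self.last_t
        pi = math.exp(-dt / self.alpha)
        self.s = x + pi * self.s
        self.T = dt + pi * self.T
        self.last_t = t

    def rate(self):
        if self.T == 0.0:
            raise ValueError("no elapsed time observed yet")
        return self.s / self.T
```

For example, adding an amount of 2 once per time unit yields a rate of 2 regardless of `alpha`, since both sums are scaled identically.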

        I will commit shortly if I don't hear otherwise.


          People

          • Assignee:
            Ted Dunning
            Reporter:
            Ted Dunning
          • Votes:
            0
            Watchers:
            0