  1. Hadoop Common
  2. HADOOP-13804

MutableStat mean loses accuracy if add(long, long) is used

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.6.5
    • Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha2
    • Component/s: metrics
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Currently, if MutableStat.add(long numSamples, long sum) is called with a large sample count, the returned mean is very inaccurate. The cause is that the mean is computed with the same Welford update that is used for the variance, and that update assumes each sample is processed individually. For the variance this is acceptable, since a variance loses its meaning when many samples are added at once, but the mean should still be accurate.
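
      The effect can be reproduced with a minimal sketch. This is not the Hadoop source: MeanSketch, IncrementalMean, and TotalBasedMean are hypothetical names used only for illustration. The first class applies an incremental, Welford-style mean update to a whole batch as if the batch sum were a single sample. The second keeps a running total and divides by the sample count, which is one possible way to keep the mean accurate and is assumed here rather than taken from the patch.

```java
// Illustrative sketch only; class and method names are hypothetical,
// not taken from Hadoop's metrics code.
final class MeanSketch {

    /** Incremental mean update that treats each call as one new sample. */
    static final class IncrementalMean {
        private long numSamples;
        private double mean;

        void add(long nSamples, double sum) {
            numSamples += nSamples;
            // Correct only when nSamples == 1: `sum` is used as if it were
            // a single sample value, even though it aggregates nSamples values.
            mean += (sum - mean) / numSamples;
        }

        double mean() {
            return mean;
        }
    }

    /** Mean derived from a running total, independent of batch size. */
    static final class TotalBasedMean {
        private long numSamples;
        private double total;

        void add(long nSamples, double sum) {
            numSamples += nSamples;
            total += sum;
        }

        double mean() {
            return numSamples == 0 ? 0.0 : total / numSamples;
        }
    }

    public static void main(String[] args) {
        IncrementalMean incremental = new IncrementalMean();
        TotalBasedMean totalBased = new TotalBasedMean();

        // One sample of value 10, then a batch of 1000 samples that sum to
        // 10000, so every sample is 10 and the true mean is 10.
        incremental.add(1, 10);
        incremental.add(1000, 10000);
        totalBased.add(1, 10);
        totalBased.add(1000, 10000);

        System.out.println("incremental mean: " + incremental.mean());
        System.out.println("total-based mean: " + totalBased.mean());
    }
}
```

      With these inputs the incremental update drifts to roughly 19.98, while the total-based mean stays at 10. The sketch leaves variance out entirely, since the description only requires the mean to stay accurate for batched adds.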

      Attachments

        1. HADOOP-13804.000.patch
          4 kB
          Erik Krogen

            People

              Assignee: Erik Krogen (xkrogen)
              Reporter: Erik Krogen (xkrogen)