  1. Hadoop Common
  2. HADOOP-13804

MutableStat mean loses accuracy if add(long, long) is used

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.6.5
    • Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha2
    • Component/s: metrics
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently, if MutableStat.add(long numSamples, long sum) is called with a large sample count, the returned mean is very inaccurate. The cause is that the mean is taken from the Welford method used for the variance calculation, and that method assumes samples are processed one at a time. For variance this is acceptable, since variance loses meaning when many samples are added at once, but the mean should still be accurate.
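
      To illustrate the effect described above, here is a minimal sketch. It is not Hadoop's code: the class names MeanSketch, WelfordStyleMean, and SumBasedMean are hypothetical, and the assumption is that a batch passed to add(numSamples, sum) is fed into a Welford-style update as if it were a single sample. Under that assumption the running mean drifts, while a mean computed as total sum divided by sample count stays accurate.

```java
// Hypothetical illustration only; these classes are not part of Hadoop.
final class MeanSketch {

  /** Welford-style running mean that treats each add() call as ONE sample. */
  static final class WelfordStyleMean {
    private long numSamples;
    private double mean;

    void add(long n, double sum) {
      numSamples += n;
      // Welford's update expects a single observation here. Passing the
      // sum of a whole batch violates that assumption and skews the mean.
      mean += (sum - mean) / numSamples;
    }

    double mean() {
      return mean;
    }
  }

  /** Mean derived from the running total, independent of batch size. */
  static final class SumBasedMean {
    private long numSamples;
    private double total;

    void add(long n, double sum) {
      numSamples += n;
      total += sum;
    }

    double mean() {
      return numSamples == 0 ? 0.0 : total / numSamples;
    }
  }

  public static void main(String[] args) {
    WelfordStyleMean welford = new WelfordStyleMean();
    SumBasedMean sumBased = new SumBasedMean();

    // One sample of value 10, then a batch of 1000 samples that are all 10.
    welford.add(1, 10);
    welford.add(1000, 10_000);
    sumBased.add(1, 10);
    sumBased.add(1000, 10_000);

    // Every sample is 10, so the true mean is 10.
    System.out.println("Welford-style mean: " + welford.mean());   // approximately 19.98
    System.out.println("Sum-based mean:     " + sumBased.mean());  // 10.0
  }
}
```

      In this sketch the batch call nearly doubles the Welford-style mean even though every sample has the same value. Tracking the total separately is one way to keep the mean accurate while leaving the variance calculation unchanged.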

              People

              • Assignee:
                xkrogen Erik Krogen
              • Reporter:
                xkrogen Erik Krogen