Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3
    • Labels: None

      Description

      It might be useful to have a StatUtils function to get the mode.

      However, this may be tricky as it does not easily fit in with the current StatUtils design.

      The mode can have multiple elements, but all the other methods only return a single value.

      There are at least two options for converting an array to a single value:

      • arbitrarily (or randomly) pick one
      • throw an Exception if there is more than one mode

      Or maybe StatUtils should return a double array.

      StatUtils also uses nested classes for all but the difference and normalize methods. However, the standard interfaces and classes don't support returning arrays.

      1. MATH-1007t.patch
        2 kB
        Phil Steitz
      2. MATH-1007.patch
        4 kB
        Sebb

          Activity

          Sebb added a comment -

          The tests found a bug in the static double[] getMode method: I had forgotten to increment the array index. Oops!

          URL: http://svn.apache.org/r1504495
          Log:
          MATH-1007 Add mode function to StatUtils class

          Modified:
          commons/proper/math/trunk/src/changes/changes.xml
          commons/proper/math/trunk/src/main/java/org/apache/commons/math3/stat/StatUtils.java
          commons/proper/math/trunk/src/test/java/org/apache/commons/math3/stat/StatUtilsTest.java

          Phil Steitz added a comment -

          Looks good. Here are some tests.

          Sebb added a comment - edited

          Sample implementation - this does not use a separate helper class, and does its own parameter validation.

          Sebb added a comment -

          OK. Minor clarifications:

          s/in the first element/as the only element/

          s/array of maximum frequency elements/array of the most frequently occurring element(s)/

          ==

          I'd not considered NaNs - not sure that the Frequency implementation ignores them; need to check.

          What about a null array?
          Most of the other methods throw MathIllegalArgumentException for this.

          Also, do you want to support begin and length params like many of the other methods?
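To make the null-array and begin/length questions concrete, here is a minimal sketch of what such an overload might look like. It is not the code from the attached patch: the class name `RangeModeSketch` is hypothetical, the JDK's `IllegalArgumentException` stands in for `MathIllegalArgumentException` so the example has no dependencies, and the sort-and-scan approach is just one simple way to find the modes of a sub-range.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of Commons Math. */
public final class RangeModeSketch {

    private RangeModeSketch() { }

    /**
     * Returns the mode(s) of values[begin .. begin + length - 1] in ascending order.
     * NaNs are ignored; a null array or an invalid range is rejected.
     */
    public static double[] mode(final double[] values, final int begin, final int length) {
        if (values == null) {
            // Assumption: the real method would throw MathIllegalArgumentException here.
            throw new IllegalArgumentException("input array is null");
        }
        if (begin < 0 || length < 0 || begin > values.length - length) {
            throw new IllegalArgumentException(
                "invalid range: begin=" + begin + ", length=" + length);
        }

        // Sort a copy of the range so that equal values become adjacent.
        final double[] sorted = Arrays.copyOfRange(values, begin, begin + length);
        Arrays.sort(sorted); // Arrays.sort places NaNs at the end

        int n = sorted.length;
        while (n > 0 && Double.isNaN(sorted[n - 1])) {
            n--; // drop trailing NaNs
        }

        final double[] modes = new double[n];
        int found = 0; // number of candidate modes collected so far
        int best = 0;  // highest frequency seen so far
        int i = 0;
        while (i < n) {
            int j = i;
            // Note: == treats -0.0 and 0.0 as the same value.
            while (j < n && sorted[j] == sorted[i]) {
                j++;
            }
            final int run = j - i;
            if (run > best) {
                best = run;
                found = 0; // a new maximum discards earlier candidates
            }
            if (run == best) {
                modes[found++] = sorted[i];
            }
            i = j;
        }
        return Arrays.copyOf(modes, found);
    }
}
```

With the sample from the proposed Javadoc, `RangeModeSketch.mode(sample, 0, sample.length)` yields the two modes 0 and 5, while a range in which every value occurs once returns all of those values.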

          Phil Steitz added a comment -

          I think it is best to return a double[]. Here is what I propose as method signature / contract:

          /**
           * Returns the sample mode(s).  The mode is the most frequently occurring
           * value in the sample. If there is a unique value with maximum frequency,
           * this value is returned in the first element of the output array. Otherwise,
           * the returned array contains the maximum frequency elements in increasing
           * order.  For example, if {@code sample} is {0, 12, 5, 6, 0, 13, 5, 17},
           * the returned array will have length two, with 0 in the first element and
           * 5 in the second.
           *
           * <p>NaN values are ignored when computing the mode - i.e., NaNs will never
           * appear in the output array.  If the sample includes only NaNs or has
           * length 0, an empty array is returned.</p>
           *
           * @param sample input data
           * @return array of maximum frequency elements sorted in ascending order.
           */
          public static double[] mode(final double[] sample)
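As a rough illustration of this contract, the following sketch counts occurrences in a sorted map and returns the most frequent values. It is an assumption-laden example, not the committed implementation: the class name `ModeSketch` is hypothetical, it does no null checking, and because it keys on boxed `Double` values it treats 0.0 and -0.0 as distinct.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not part of Commons Math. */
public final class ModeSketch {

    private ModeSketch() { }

    /** Returns the most frequent non-NaN values of the sample, in ascending order. */
    public static double[] mode(final double[] sample) {
        // A TreeMap keeps its keys sorted, so the modes come out in ascending order.
        final Map<Double, Integer> counts = new TreeMap<>();
        int max = 0;
        for (final double value : sample) {
            if (Double.isNaN(value)) {
                continue; // NaNs are ignored, as the contract proposes
            }
            final int count = counts.merge(value, 1, Integer::sum);
            max = Math.max(max, count);
        }

        // First pass: how many values reach the maximum frequency?
        int size = 0;
        for (final int count : counts.values()) {
            if (count == max) {
                size++;
            }
        }

        // Second pass: copy those values out.
        final double[] modes = new double[size];
        int i = 0;
        for (final Map.Entry<Double, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == max) {
                modes[i++] = entry.getKey(); // the index must advance on every store
            }
        }
        return modes;
    }
}
```

For the sample {0, 12, 5, 6, 0, 13, 5, 17} this returns an array of length two holding 0 and 5; an empty or all-NaN sample produces an empty array because no value is ever counted.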
          
          

            People

            • Assignee: Unassigned
            • Reporter: Sebb
            • Votes: 0
            • Watchers: 2
