Mahout / MAHOUT-945

The variance calculation of the random forest regression tree

    Details

      Description

      Hi, Mukai
      Thanks for your effort in extending the RF to regression. However, I have a question about your implementation in RegressionSplit.java. The variance method is:
      "
      private static double variance(double[] s, double[] ss, double[] dataSize) {
        double var = 0;
        for (int i = 0; i < s.length; i++) {
          if (dataSize[i] > 0) {
            var += ss[i] - ((s[i] * s[i]) / dataSize[i]);
          }
        }
        return var;
      }
      "

      I would expect the variance to be computed as something like:
      var += ss[i]/dataSize[i] - ((s[i] * s[i]) / (dataSize[i]*dataSize[i]));

      Please correct me if I am wrong. Thanks.
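
To make the difference between the two expressions concrete, here is a minimal sketch that evaluates both on the same inputs. It assumes that s[i] is the sum of the labels in partition i, ss[i] is the sum of their squares, and dataSize[i] is the number of records in that partition. The class name, method names, and sample values are hypothetical and are not taken from the patch. Under those assumptions, the quoted method returns the sum of squared deviations from each partition mean, while the proposed expression returns the population variance of each partition; per partition the first is exactly dataSize[i] times the second.

```java
public class VarianceSketch {

  // Quantity computed by the quoted method: for each non-empty partition,
  // ss[i] - s[i]^2 / n[i], i.e. the sum of squared deviations from the
  // partition mean, accumulated over all partitions.
  static double sumOfSquaredDeviations(double[] s, double[] ss, double[] dataSize) {
    double total = 0;
    for (int i = 0; i < s.length; i++) {
      if (dataSize[i] > 0) {
        total += ss[i] - (s[i] * s[i]) / dataSize[i];
      }
    }
    return total;
  }

  // Quantity proposed in the report: for each non-empty partition,
  // ss[i] / n[i] - (s[i] / n[i])^2, i.e. the population variance,
  // accumulated over all partitions without weighting by size.
  static double sumOfVariances(double[] s, double[] ss, double[] dataSize) {
    double total = 0;
    for (int i = 0; i < s.length; i++) {
      if (dataSize[i] > 0) {
        total += ss[i] / dataSize[i]
            - (s[i] * s[i]) / (dataSize[i] * dataSize[i]);
      }
    }
    return total;
  }

  public static void main(String[] args) {
    // Hypothetical labels: partition 0 = {1, 2, 3}, partition 1 = {10, 20}.
    double[] s = {6, 30};        // sums of labels
    double[] ss = {14, 500};     // sums of squared labels
    double[] dataSize = {3, 2};  // record counts

    // 2 + 50 = 52.0
    System.out.println(sumOfSquaredDeviations(s, ss, dataSize));
    // 2/3 + 25, approximately 25.67
    System.out.println(sumOfVariances(s, ss, dataSize));
  }
}
```

Because the first quantity weights each partition by its size, a small partition cannot dominate the total, whereas the unweighted sum of variances treats a two-record partition the same as a large one. Summing squared deviations is a common split criterion for regression trees, but whether that is what the patch intended is a question for its author.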

      1. MAHOUT-945.patch
        9 kB
        Ikumasa Mukai
      2. MAHOUT-945.patch
        12 kB
        Ikumasa Mukai
      3. MAHOUT-945.patch
        12 kB
        Ikumasa Mukai

            People

            • Assignee: Sean Owen
            • Reporter: Wang Yue

                Time Tracking

                Estimated: 48h
                Remaining: 48h
                Logged: Not Specified