  1. Mahout
  2. MAHOUT-945

The variance calculation of Random forest regression tree


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.8
    • None

    Description

      Hi Mukai,
      Thanks for your efforts in extending the RF to regression. However, I have a question about your implementation in Regressionsplit.java. The variance method is:
      "
      private static double variance(double[] s, double[] ss, double[] dataSize) {
        double var = 0;
        for (int i = 0; i < s.length; i++) {
          if (dataSize[i] > 0) {
            var += ss[i] - ((s[i] * s[i]) / dataSize[i]);
          }
        }
        return var;
      }
      "

      In my understanding, however, the variance should be something like:
      var += ss[i]/dataSize[i] - ((s[i] * s[i]) / (dataSize[i]*dataSize[i]));

      Please correct me if I am wrong. Thanks.
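To make the difference between the two expressions concrete, here is a minimal Java sketch that evaluates both on the same inputs. The class and method names (`VarianceComparison`, `sumOfSquaredDeviations`, `summedVariance`) are hypothetical and not part of Mahout. The first method mirrors the quoted code and the second applies the formula proposed in this report. Algebraically, `ss[i] - s[i]*s[i]/dataSize[i]` equals `dataSize[i]` times the population variance of partition `i`, so the quoted code returns a size-weighted sum of squared deviations rather than a plain sum of variances.

```java
final class VarianceComparison {

  private VarianceComparison() {}

  /** Mirrors the quoted method: sum over partitions of ss - s^2 / n. */
  static double sumOfSquaredDeviations(double[] s, double[] ss, double[] dataSize) {
    double total = 0;
    for (int i = 0; i < s.length; i++) {
      if (dataSize[i] > 0) {
        total += ss[i] - (s[i] * s[i]) / dataSize[i];
      }
    }
    return total;
  }

  /** Formula proposed in the report: sum over partitions of ss / n - (s / n)^2. */
  static double summedVariance(double[] s, double[] ss, double[] dataSize) {
    double total = 0;
    for (int i = 0; i < s.length; i++) {
      if (dataSize[i] > 0) {
        double mean = s[i] / dataSize[i];
        total += ss[i] / dataSize[i] - mean * mean;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    // Illustrative data: partition 0 holds labels {1, 3}, partition 1 holds {2, 4, 6}.
    double[] s = {4.0, 12.0};        // sum of labels per partition
    double[] ss = {10.0, 56.0};      // sum of squared labels per partition
    double[] dataSize = {2.0, 3.0};  // number of labels per partition

    System.out.println("quoted code:      " + sumOfSquaredDeviations(s, ss, dataSize));
    System.out.println("proposed formula: " + summedVariance(s, ss, dataSize));
  }
}
```

The two quantities differ only by the per-partition factor `dataSize[i]`, so they can rank candidate splits differently when partition sizes are unequal. Regression trees are commonly described as minimising the size-weighted form, but which one is intended here is for the implementation's author to confirm.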

      Attachments

        1. MAHOUT-945.patch
          9 kB
          Ikumasa Mukai
        2. MAHOUT-945.patch
          12 kB
          Ikumasa Mukai
        3. MAHOUT-945.patch
          12 kB
          Ikumasa Mukai


          People

            Assignee: srowen Sean R. Owen
            Reporter: fayue1015 Wang Yue
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 48h
                Remaining: 48h
                Logged: Not Specified
