  1. Apache MADlib
  2. MADLIB-1205

Add gini importance to DT

Details

    Description

      From the Breiman resource that we use for random forest:

      Gini importance

      Every time a split of a node is made on variable m, the gini impurity criterion for the two descendant nodes is less than that of the parent node. Adding up the gini decreases for each individual variable over all trees in the forest gives a fast variable importance that is often very consistent with the permutation importance measure.

      We can add a similar measure to our decision tree (DT) code, called impurity_variable_importance.
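
The measure described above could be sketched roughly as follows. This is a minimal illustration for a single tree, not MADlib's implementation: the function names (`gini`, `impurity_importance`), the tuple layout used to describe a split, and the example tree are all hypothetical. Weighting each decrease by the fraction of rows reaching the node is an assumption borrowed from common practice; the description above only says to add up the decreases.

```python
from collections import defaultdict


def gini(counts):
    """Gini impurity of a node, given its per-class row counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def impurity_importance(splits, n_total):
    """Sum the Gini decrease of every split, grouped by split variable.

    splits: iterable of (feature, parent_counts, left_counts, right_counts)
            (hypothetical layout chosen for this sketch).
    n_total: number of rows at the root of the tree.
    """
    importance = defaultdict(float)
    for feature, parent, left, right in splits:
        n_p, n_l, n_r = sum(parent), sum(left), sum(right)
        # Parent impurity minus the size-weighted impurity of the children.
        decrease = (gini(parent)
                    - (n_l / n_p) * gini(left)
                    - (n_r / n_p) * gini(right))
        # Assumption: weight by the share of rows that reach this node.
        importance[feature] += (n_p / n_total) * decrease
    return dict(importance)


# Hypothetical two-split tree over 10 rows and two classes.
example_splits = [
    ("x1", [6, 4], [6, 1], [0, 3]),  # root split on x1
    ("x2", [6, 1], [6, 0], [0, 1]),  # left child split on x2
]
imp = impurity_importance(example_splits, n_total=10)
```

Because every leaf in this example tree is pure, the per-variable importances add up to the Gini impurity of the root. For a forest, the same per-tree values would be accumulated (or averaged) across trees.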

            People

              riyer Rahul Iyer