Spark / SPARK-1303

Added discretization capability to MLlib.

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: MLlib
    • Labels: None

    Description

      Some time ago, we discussed with Ameet Talwalkar the possibility of including both feature selection and discretization algorithms in MLlib.

      In this patch we have implemented Entropy Minimization Discretization, following the algorithm described in the paper "Multi-interval discretization of continuous-valued attributes for classification learning" by Fayyad and Irani (1993). It is one of the most widely used discretizers and is already included in libraries such as Weka. It can serve as a base for feature selection algorithms and as a preprocessing step for the NaiveBayes classifier already included in MLlib.
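
      For readers unfamiliar with the method, here is a minimal single-machine sketch of the entropy-based recursive splitting with the MDL stopping criterion from Fayyad and Irani (1993). It is plain Python, not the code from this patch and not an MLlib API. The function names (`entropy`, `best_cut`, `mdlp_accepts`, `mdlp_cut_points`) are hypothetical and chosen only for illustration. For simplicity it evaluates every position between distinct values rather than only class-boundary points.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Find the split index that minimizes class entropy.

    `values` must be sorted ascending. A cut at index i separates
    [0, i) from [i, n). Returns (index, weighted_entropy) or None.
    """
    n = len(values)
    best = None
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue  # no threshold can separate identical values
        left, right = labels[:i], labels[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or weighted < best[1]:
            best = (i, weighted)
    return best


def mdlp_accepts(labels, i, weighted):
    """MDL criterion: accept the cut only if the gain exceeds the coding cost."""
    n = len(labels)
    left, right = labels[:i], labels[i:]
    ent = entropy(labels)
    gain = ent - weighted
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (k * ent - k1 * entropy(left) - k2 * entropy(right))
    return gain > (log2(n - 1) + delta) / n


def mdlp_cut_points(values, labels):
    """Return the sorted list of accepted cut points for one continuous feature."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    cuts = []

    def recurse(lo, hi):
        sub_x, sub_y = xs[lo:hi], ys[lo:hi]
        if len(sub_x) < 2:
            return
        found = best_cut(sub_x, sub_y)
        if found is None:
            return
        i, weighted = found
        if not mdlp_accepts(sub_y, i, weighted):
            return
        # Place the threshold midway between the two neighbouring values.
        cuts.append((sub_x[i - 1] + sub_x[i]) / 2)
        recurse(lo, lo + i)
        recurse(lo + i, hi)

    recurse(0, len(xs))
    return sorted(cuts)
```

      Each accepted cut point becomes a bin boundary, and the recursion stops on a partition as soon as the best available split no longer pays for its own description length. A distributed version would need to compute the per-candidate class counts across partitions, which this sketch does not attempt.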

            People

              Assignee: Unassigned
              Reporter: LIDIAgroup
              Votes: 0
              Watchers: 6