OpenNLP / OPENNLP-861

Add Chi-Squared Data Indexer for Feature Selection


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Abandoned
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: Machine Learning

    Description

      Text classification naturally produces a large number of features. Many of them are statistically independent of the category and contribute little or no information to the classification.

      A Chi-Squared data indexer would score how strongly each feature depends on the category and remove the features whose score does not pass a threshold. This keeps the feature list at a reasonable size without significantly affecting classification accuracy.
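
To make the idea concrete, here is a minimal, self-contained sketch of threshold-based Chi-Squared feature selection. It is not the implementation proposed in this issue and does not use any OpenNLP API: the class `ChiSquaredSketch`, its methods, and the toy data are hypothetical names chosen for illustration. It assumes binary features (present or absent per document), scores each feature by its largest 2x2 Chi-Squared statistic over all categories, and uses 3.84 as the threshold (roughly the critical value for p = 0.05 with one degree of freedom).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of OpenNLP.
final class ChiSquaredSketch {

  // Chi-squared statistic for a 2x2 contingency table:
  //   a = documents in the category that contain the feature
  //   b = documents outside the category that contain the feature
  //   c = documents in the category without the feature
  //   d = documents outside the category without the feature
  static double chiSquared(long a, long b, long c, long d) {
    double n = a + b + c + d;
    double denom = (double) (a + b) * (c + d) * (a + c) * (b + d);
    if (denom == 0.0) {
      return 0.0; // a row or column is empty, so no dependence can be measured
    }
    double diff = (double) a * d - (double) b * c;
    return n * diff * diff / denom;
  }

  // Keeps the features whose best per-category score reaches the threshold.
  static Set<String> selectFeatures(List<Set<String>> docs, List<String> labels,
                                    double threshold) {
    int n = docs.size();
    Map<String, Integer> categoryCounts = new HashMap<>();
    Map<String, Integer> featureCounts = new HashMap<>();
    Map<String, Map<String, Integer>> jointCounts = new HashMap<>();

    // Count documents per category, per feature, and per (feature, category).
    for (int i = 0; i < n; i++) {
      String label = labels.get(i);
      categoryCounts.merge(label, 1, Integer::sum);
      for (String feature : docs.get(i)) {
        featureCounts.merge(feature, 1, Integer::sum);
        jointCounts.computeIfAbsent(feature, k -> new HashMap<>())
            .merge(label, 1, Integer::sum);
      }
    }

    Set<String> kept = new TreeSet<>();
    for (Map.Entry<String, Integer> fe : featureCounts.entrySet()) {
      String feature = fe.getKey();
      long withFeature = fe.getValue();
      double best = 0.0;
      for (Map.Entry<String, Integer> ce : categoryCounts.entrySet()) {
        long a = jointCounts.get(feature).getOrDefault(ce.getKey(), 0);
        long b = withFeature - a;
        long c = ce.getValue() - a;
        long d = n - a - b - c;
        best = Math.max(best, chiSquared(a, b, c, d));
      }
      if (best >= threshold) {
        kept.add(feature);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Toy corpus: "the" occurs everywhere, "goal" and "vote" track the category.
    List<Set<String>> docs = Arrays.asList(
        new HashSet<>(Arrays.asList("goal", "the")),
        new HashSet<>(Arrays.asList("goal", "the")),
        new HashSet<>(Arrays.asList("vote", "the")),
        new HashSet<>(Arrays.asList("vote", "the")));
    List<String> labels = Arrays.asList("sports", "sports", "politics", "politics");
    System.out.println(selectFeatures(docs, labels, 3.84)); // prints [goal, vote]
  }
}
```

In this toy corpus "the" appears in every document, so its score is 0 and it is dropped, while "goal" and "vote" each score 4.0 and are kept. A real data indexer would also have to map the surviving features to integer indices, which this sketch leaves out.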

          People

            Assignee: Unassigned
            Reporter: Joey Hong (jxihong)
            Votes: 0
            Watchers: 2
