SPARK-15957

RFormula supports forcing to index label

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: ML
    • Labels: None

    Description

      Currently, RFormula indexes the label only when it is of string type. If the label is numeric and RFormula is used with a classification model, the label column metadata contains no label attributes. These attributes are useful when making predictions for classification, so for classification we can force the label to be indexed by StringIndexer whether it is numeric or string. SparkR wrappers can then extract the label attributes from the label column metadata. This feature also helps fix bugs similar to SPARK-15153.
      For regression, the label is still kept as numeric type.
      This PR adds a param, indexLabel, that controls whether RFormula forces the label to be indexed.
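
      To illustrate why indexing a numeric label helps, here is a minimal plain-Python sketch. It is not Spark code. The helper names (fit_label_index, index_labels) are hypothetical, and the ordering rule (most frequent label first, ties broken by string value) is an assumption made for this example. The ordered label list plays the role of the label attributes that a wrapper would read back from the column metadata.

```python
from collections import Counter


def fit_label_index(labels):
    """Return the distinct labels as strings, most frequent first.

    Hypothetical helper: ties are broken by string value, which is a
    choice made for this sketch only.
    """
    counts = Counter(str(v) for v in labels)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked]


def index_labels(labels, ordered_labels):
    """Map each raw label to its position in ordered_labels (as a float)."""
    position = {label: float(i) for i, label in enumerate(ordered_labels)}
    return [position[str(v)] for v in labels]


# A numeric label column for a three-class problem.
raw = [3.0, 7.0, 3.0, 3.0, 7.0, 5.0]

# The ordered list stands in for the label attributes kept in metadata.
label_attrs = fit_label_index(raw)
indexed = index_labels(raw, label_attrs)

# A classifier predicts an index; the stored attributes recover the label.
predicted_index = 1
original_label = label_attrs[predicted_index]
```

      Without the stored label list, a predicted index could not be mapped back to the original class value, which is the gap described above for numeric labels.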

            People

              Assignee: Yanbo Liang (yanboliang)
              Reporter: Yanbo Liang (yanboliang)
              Shepherd: Joseph K. Bradley