Spark / SPARK-8774

Add R model formula with basic support as a transformer

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: ML, SparkR
    • Labels: None

    Description

      To integrate better with SparkR, we can add a feature transformer that supports R formulas. A list of the operators R supports can be found here: http://ww2.coastal.edu/kingw/statistics/R-tutorials/formulae.html

      The initial version should support "~", "+", and "." on numeric columns; support for further operators can be added later.

      val formula = new RModelFormula()
        .setFormula("y ~ x + z")
      

      The output should append two new columns: features and label.
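
To make the intended behaviour concrete, here is a minimal, Spark-free sketch of how a formula limited to "~", "+", and "." might be resolved against a set of numeric columns and used to append the features and label columns. Everything in it (FormulaSketch, ParsedFormula, parse, transform) is a hypothetical name chosen for illustration, and a row is modelled as a plain Map rather than a DataFrame. It also assumes that "." expands to every column except the label, as it does in R. It is not the implementation proposed in the design doc.

```scala
// Hypothetical, Spark-free sketch of the proposed behaviour; none of these
// names come from Spark itself.
object FormulaSketch {
  // A parsed formula: the label column and the feature columns, in order.
  final case class ParsedFormula(label: String, features: Seq[String])

  // Parse a formula restricted to the "~", "+", and "." operators.
  // `columns` is the input schema, which is needed to expand ".".
  def parse(formula: String, columns: Seq[String]): ParsedFormula = {
    val sides = formula.split("~").map(_.trim)
    require(sides.length == 2 && sides.forall(_.nonEmpty),
      s"expected 'label ~ terms', got: $formula")
    val label = sides(0)
    val features = sides(1).split("\\+").map(_.trim).toSeq.flatMap {
      case "."  => columns.filter(_ != label) // assumption: "." = all columns but the label
      case term => Seq(term)
    }.distinct
    val unknown = (label +: features).filterNot(columns.contains)
    require(unknown.isEmpty, s"unknown columns: ${unknown.mkString(", ")}")
    ParsedFormula(label, features)
  }

  // Append "features" and "label" entries to a row of numeric columns,
  // leaving the original columns in place.
  def transform(parsed: ParsedFormula, row: Map[String, Double]): Map[String, Any] =
    row ++ Map(
      "features" -> parsed.features.map(row).toVector,
      "label"    -> row(parsed.label)
    )

  def main(args: Array[String]): Unit = {
    val columns = Seq("y", "x", "z")
    val parsed = parse("y ~ .", columns) // "." expands to x and z here
    println(transform(parsed, Map("y" -> 1.0, "x" -> 2.0, "z" -> 3.0)))
  }
}
```

In this sketch the schema is passed to parse explicitly because "." cannot be expanded without knowing which columns exist. A real transformer would presumably read it from the input DataFrame instead.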

      Design doc is posted at https://docs.google.com/document/d/10NZNSEurN2EdWM31uFYsgayIPfCFHiuIu3pCWrUmP_c/edit?usp=sharing, as part of SPARK-6805.

            People

              Assignee: Eric Liang (ekhliang)
              Reporter: Xiangrui Meng (mengxr)
              Shepherd: Xiangrui Meng
              Votes: 0
              Watchers: 4
