Description
To integrate better with SparkR, we can add a feature transformer that supports R formulas. A list of the operators R supports can be found here: http://ww2.coastal.edu/kingw/statistics/R-tutorials/formulae.html
The initial version should support "~", "+", and "." on numeric columns; we can expand it in the future.
val formula = new RModelFormula().setFormula("y ~ x + z")
The transformer should append two new columns to its output: features and label.
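As a rough illustration of the semantics proposed above, the sketch below resolves a formula that uses only "~", "+", and "." into a label column and a list of feature columns. It is plain Scala with no Spark dependency; `ResolvedFormula` and `FormulaSketch.resolve` are hypothetical names invented for this example and are not part of the proposed transformer. It also assumes that "." means "every column except the label", as in R.

```scala
// Hypothetical helper, not part of Spark: resolves an R-style formula
// against a list of column names, supporting only "~", "+", and ".".
final case class ResolvedFormula(label: String, features: Seq[String])

object FormulaSketch {
  def resolve(formula: String, columns: Seq[String]): ResolvedFormula = {
    // Split "label ~ terms" into its two sides.
    val sides = formula.split("~").map(_.trim)
    require(sides.length == 2 && sides.forall(_.nonEmpty),
      s"expected 'label ~ terms', got: $formula")
    val label = sides(0)
    require(columns.contains(label), s"unknown label column: $label")

    // Split the right-hand side on "+" to get the individual terms.
    val terms = sides(1).split("\\+").map(_.trim).toSeq
    require(terms.forall(_.nonEmpty), s"empty term in: $formula")

    // Assumption: "." expands to every column except the label.
    val features = terms.flatMap {
      case "." => columns.filter(_ != label)
      case name =>
        require(columns.contains(name), s"unknown column: $name")
        Seq(name)
    }.distinct

    ResolvedFormula(label, features)
  }
}
```

With columns `y`, `x`, `z`, and `w`, calling `FormulaSketch.resolve("y ~ x + z", Seq("y", "x", "z", "w"))` selects `y` as the label and `x`, `z` as the features, while `"y ~ ."` would select all of `x`, `z`, and `w`. A real transformer would then assemble the selected feature columns into the features vector column and copy the label column into label.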
Design doc is posted at https://docs.google.com/document/d/10NZNSEurN2EdWM31uFYsgayIPfCFHiuIu3pCWrUmP_c/edit?usp=sharing, as part of SPARK-6805.
Issue Links
- is related to
  - SPARK-9544 RFormula in Python (Resolved)
- relates to
  - SPARK-9895 User Guide for RFormula Feature Transformer (Resolved)
  - SPARK-6805 MLlib + SparkR integration for 1.5 (Closed)
- links to