Spark / SPARK-18862

Split SparkR mllib.R into multiple files

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.0
    • Component/s: ML, SparkR
    • Labels: None

    Description

      SparkR's mllib.R keeps growing as we add more ML wrappers. I'd like to split it into multiple files to make it easier to maintain:

      • mllibClassification.R
      • mllibRegression.R
      • mllibClustering.R
      • mllibFeature.R

      or:

      • mllib/classification.R
      • mllib/regression.R
      • mllib/clustering.R
      • mllib/features.R

      By R convention, the first layout is preferred, and I'm not sure whether R supports the second, subdirectory-based layout (will check later). Please let me know your preference. I think the start of a new release cycle is a good opportunity to do this, since it will involve fewer conflicts. If this proposal is approved, I can work on it.

      cc felixcheung josephkb mengxr

          People

            Assignee: Yanbo Liang (yanboliang)
            Reporter: Yanbo Liang (yanboliang)