  1. Spark
  2. SPARK-830

Add DenseVector and SparseVector to mllib, and replace all Array[Double] with Vectors

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 1.0.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      Currently, machine learning models in the mllib package use raw Array[Double] directly, which is neither portable nor elegant.

      Replacing arrays with vectors provides the following benefits:
      1. Higher performance. When the data is dense, using an array is fine, but when the data is sparse, a SparseVector can perform better because it stores and operates on only the nonzero entries.

      2. Higher abstraction. Vectors provide a higher-level abstraction that is elegant and intuitive, whereas Array[Double] is verbose.
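
      To make the two benefits above concrete, here is a minimal, self-contained sketch. The names Vec, DenseVec, SparseVec and VecDemo are hypothetical and exist only for illustration; they are not the MLlib API and not necessarily the design adopted for this issue. The sketch assumes a sparse vector is stored as parallel arrays of indices and nonzero values, which is one common layout.

```scala
// Hypothetical, simplified types for illustration only; this is not the MLlib API.
sealed trait Vec {
  def size: Int
  // Dot product against a dense weight array, as a linear model might compute it.
  def dot(weights: Array[Double]): Double
}

// Dense representation: every entry is stored, including zeros.
final case class DenseVec(values: Array[Double]) extends Vec {
  def size: Int = values.length

  def dot(weights: Array[Double]): Double = {
    require(weights.length == size, "dimension mismatch")
    var sum = 0.0
    var i = 0
    while (i < size) { // touches all `size` entries
      sum += values(i) * weights(i)
      i += 1
    }
    sum
  }
}

// Sparse representation (assumed layout): only nonzero entries are stored,
// as parallel arrays of positions and values.
final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double]) extends Vec {
  require(indices.length == values.length, "indices and values must align")

  def dot(weights: Array[Double]): Double = {
    require(weights.length == size, "dimension mismatch")
    var sum = 0.0
    var k = 0
    while (k < indices.length) { // touches only the stored nonzero entries
      sum += values(k) * weights(indices(k))
      k += 1
    }
    sum
  }
}

object VecDemo {
  val weights: Array[Double] = Array(0.5, 2.0, 1.0, 4.0, 3.0)

  // The same logical vector (2, 0, 0, 0, 1) in both representations.
  val dense: Vec = DenseVec(Array(2.0, 0.0, 0.0, 0.0, 1.0))
  val sparse: Vec = SparseVec(5, Array(0, 4), Array(2.0, 1.0))
}
```

      Code written against the Vec trait does not need to know which representation it receives, which is the abstraction benefit. The sparse dot product loops over two stored entries instead of five, which is where the performance benefit would come from on mostly-zero data.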


            People

            • Assignee:
              mengxr Xiangrui Meng
              Reporter:
              soulmachine Frank Dai
            • Votes:
              0
              Watchers:
              4
