Spark / SPARK-6340

mllib.IDF for LabelPoints

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: MLlib
    • Labels:
    • Environment:

      python 2.7.8
      pyspark
      OS: Linux Mint 17 Qiana (Cinnamon 64-bit)

      Description

      As per: http://apache-spark-user-list.1001560.n3.nabble.com/Using-TF-IDF-from-MLlib-td19429.html#a19528

      Having IDF.fit accept LabeledPoint objects would be useful. Correct me if I'm wrong, but there currently isn't a way to keep track of which labels belong to which documents when applying a conventional TF-IDF transformation to labelled text data.
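
To make the request concrete, here is a minimal plain-Python sketch of the behaviour being asked for: TF-IDF is computed from the features only, and each label is carried alongside its document by position. This is not MLlib's API. The function name `tf_idf_with_labels` and the `(label, tokens)` input shape are hypothetical, chosen only for illustration, and the smoothed formula `log((m + 1) / (df + 1))` is assumed to mirror the one documented for MLlib's IDF.

```python
import math
from collections import Counter


def tf_idf_with_labels(labeled_docs):
    """Hypothetical helper: labeled_docs is a list of (label, tokens) pairs.

    Returns a list of (label, {term: tf-idf weight}) pairs in the same order,
    so every label stays attached to its own document.
    """
    labels = [label for label, _ in labeled_docs]
    # Raw term frequencies per document.
    term_counts = [Counter(tokens) for _, tokens in labeled_docs]
    n_docs = len(term_counts)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())

    # Smoothed IDF (assumed to match MLlib's documented formula).
    idf = {t: math.log((n_docs + 1.0) / (df + 1.0)) for t, df in doc_freq.items()}

    weighted = [
        {t: c * idf[t] for t, c in counts.items()} for counts in term_counts
    ]
    # Re-attach labels by position; order is unchanged throughout.
    return list(zip(labels, weighted))


docs = [
    (1.0, ["spark", "mllib", "spark"]),
    (0.0, ["python", "mllib"]),
]
result = tf_idf_with_labels(docs)
```

With RDDs, the analogous step would presumably be to zip an RDD of labels with the RDD of transformed vectors, which relies on both RDDs keeping the same ordering and partitioning.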

            People

            • Assignee:
              Unassigned
              Reporter:
              kian.ho Kian Ho
            • Votes: 0
            • Watchers: 2
