Spark / SPARK-6915 VectorIndexer improvements / SPARK-12375

VectorIndexer: allow unknown categories

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: ML
    • Labels: None

      Description

      Add an option to allow unknown categories, probably via a parameter such as "allowUnknownCategories".
      If true, handle unknown categories during transform by assigning them to an extra category index.

      The API should resemble the API used for StringIndexer.
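
      The proposed behavior can be sketched in plain Python, without Spark. This is only an illustration under stated assumptions, not the VectorIndexer implementation: the helper names fit_category_map and index_value are hypothetical, the flag mirrors the "allowUnknownCategories" name floated above, and the extra index is assumed to equal the number of categories seen during fit.

```python
from typing import Dict, List


def fit_category_map(column: List[float]) -> Dict[float, int]:
    """Map each distinct value seen during fit to an index, in sorted order."""
    return {value: index for index, value in enumerate(sorted(set(column)))}


def index_value(
    value: float,
    category_map: Dict[float, int],
    allow_unknown_categories: bool,
) -> int:
    """Return the category index for value (hypothetical helper).

    Unknown values go to one extra index, len(category_map), when
    allow_unknown_categories is true; otherwise an error is raised.
    """
    if value in category_map:
        return category_map[value]
    if allow_unknown_categories:
        # Assumption: the extra bucket sits right after the known categories.
        return len(category_map)
    raise ValueError(f"Unknown category: {value!r}")


# Fit on a column with three distinct categories.
category_map = fit_category_map([0.0, 2.0, 5.0, 2.0])

# A value seen during fit keeps its fitted index.
known = index_value(5.0, category_map, allow_unknown_categories=True)

# A value not seen during fit is assigned the extra index.
unknown = index_value(7.0, category_map, allow_unknown_categories=True)

print(known, unknown)  # prints "2 3"
```

      With the flag set to false, the same unseen value raises an error instead. For comparison, StringIndexer exposes this kind of choice through its handleInvalid parameter.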

              People

              • Assignee: Weichen Xu (WeichenXu123)
              • Reporter: Joseph K. Bradley (josephkb)
              • Votes: 0
              • Watchers: 5
