SPARK-12375: VectorIndexer: allow unknown categories

Project: Spark
Parent issue: SPARK-6915 VectorIndexer improvements

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: ML
    • Labels: None

    Description

      Add an option for allowing unknown categories, probably via a parameter like "allowUnknownCategories".
      If true, handle unknown categories during transform by assigning them to an extra category index.

      The API should resemble the API used for StringIndexer.
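
The behavior described above can be sketched in plain Python. This is an illustration only, not Spark code: the function names `fit_category_map` and `index_value` are hypothetical, and the mode names "error", "skip", and "keep" are borrowed from StringIndexer's `handleInvalid` parameter on the assumption that VectorIndexer would mirror them. In "keep" mode an unseen value is assigned the extra index equal to the number of known categories.

```python
from typing import Dict, Optional, Sequence


def fit_category_map(values: Sequence[float]) -> Dict[float, int]:
    """Map each distinct value seen during fitting to an index, in sorted order."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def index_value(
    category_map: Dict[float, int],
    value: float,
    handle_invalid: str = "error",  # hypothetical parameter, modeled on StringIndexer
) -> Optional[int]:
    """Return the category index for `value`.

    Unknown values are handled according to `handle_invalid`:
      "error": raise ValueError
      "skip":  return None so the caller can drop the row
      "keep":  return one extra index, equal to the number of known categories
    """
    if handle_invalid not in ("error", "skip", "keep"):
        raise ValueError(f"unsupported handle_invalid mode: {handle_invalid!r}")
    if value in category_map:
        return category_map[value]
    if handle_invalid == "keep":
        return len(category_map)
    if handle_invalid == "skip":
        return None
    raise ValueError(f"unknown category value: {value!r}")


# Usage: categories 0.0, 2.0, 5.0 are learned at fit time; 7.0 is unseen.
category_map = fit_category_map([2.0, 0.0, 2.0, 5.0])
known = index_value(category_map, 5.0)
unseen_kept = index_value(category_map, 7.0, handle_invalid="keep")
unseen_skipped = index_value(category_map, 7.0, handle_invalid="skip")
```

One consequence of the "keep" approach is that a downstream consumer of the indexed feature has to allow for one more category than was observed during fitting.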

            People

              Assignee: Weichen Xu (weichenxu123)
              Reporter: Joseph K. Bradley (josephkb)
