Project: Spark
Parent issue: SPARK-23562 RFormula handleInvalid should handle invalid values in non-string columns.
Issue: SPARK-23690

VectorAssembler should have handleInvalid to handle columns with null values


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.4.0
    • Component/s: ML
    • Labels: None

    Description

      VectorAssembler accepts only numeric columns (and vector columns, presumably of numeric values) as input and returns the assembled vector. It currently throws an error if it encounters a null value in any column. This behavior also affects `RFormula`, which uses VectorAssembler to assemble numeric columns.
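
To make the request concrete, here is a minimal pure-Python sketch of what a `handleInvalid` option could mean when assembling rows that contain nulls. It is not Spark code: the function `assemble` and its `handle_invalid` parameter are hypothetical, and the three modes (`error`, `skip`, `keep`) are assumed to mirror the options that `handleInvalid` offers on other Spark ML transformers. The `error` mode corresponds to the current behavior described above.

```python
import math
from typing import List, Optional, Sequence


def assemble(
    rows: Sequence[Sequence[Optional[float]]],
    handle_invalid: str = "error",
) -> List[List[float]]:
    """Assemble each row into a list of floats (hypothetical helper).

    handle_invalid (assumed semantics):
      "error" - raise if any value in a row is null (None)
      "skip"  - drop rows that contain a null
      "keep"  - keep the row and represent each null as NaN
    """
    if handle_invalid not in ("error", "skip", "keep"):
        raise ValueError(f"unsupported handle_invalid: {handle_invalid!r}")

    assembled = []
    for row in rows:
        if any(value is None for value in row):
            if handle_invalid == "error":
                raise ValueError(f"null value in row {tuple(row)!r}")
            if handle_invalid == "skip":
                continue  # drop the whole row
        # In "keep" mode, nulls become NaN; other values are cast to float.
        assembled.append([math.nan if v is None else float(v) for v in row])
    return assembled
```

With rows `[(1, 2.0), (None, 3.0)]`, the `skip` mode returns only the first row, `keep` returns both rows with NaN in place of the null, and `error` raises on the second row.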


          People

            Assignee: yogesh garg
            Reporter: yogesh garg
            Shepherd: Joseph K. Bradley