SPARK-10573: IndexToString transformSchema adds output field as DoubleType

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.1, 1.6.0
    • Component/s: ML
    • Labels: None

    Description

      Reproducible example:

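      import org.apache.spark.ml.feature.IndexToString
      import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}

      val stage = new IndexToString().setInputCol("input").setOutputCol("output")
      val inSchema = StructType(Seq(StructField("input", DoubleType)))
      val outSchema = stage.transformSchema(inSchema)
      // Fails on 1.5.0: the output field is DoubleType instead of StringType
      assert(outSchema("output").dataType == StringType)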

      The root cause seems to be that transformSchema builds the output field with NominalAttribute.toStructField, which assumes DoubleType. It would probably be better to use SchemaUtils.appendColumn and set the data type explicitly.
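A minimal sketch of the suggested approach, assuming spark-mllib and spark-sql are on the classpath. SchemaUtils is internal to Spark, so the `appendColumn` helper below is a hypothetical stand-in written against the public StructType API, and `AppendColumnSketch` is an illustrative name. This is not the actual patch. The sketch first shows that a field derived from a NominalAttribute is typed as DoubleType, then appends the output column with an explicit StringType.

```scala
import org.apache.spark.ml.attribute.NominalAttribute
import org.apache.spark.sql.types.{DataType, DoubleType, StringType, StructField, StructType}

object AppendColumnSketch {
  // Hypothetical stand-in for the internal SchemaUtils.appendColumn helper:
  // appends a column whose data type is chosen explicitly by the caller.
  def appendColumn(schema: StructType, colName: String, dataType: DataType): StructType = {
    require(!schema.fieldNames.contains(colName), s"Column $colName already exists.")
    StructType(schema.fields :+ StructField(colName, dataType))
  }

  def main(args: Array[String]): Unit = {
    val inSchema = StructType(Seq(StructField("input", DoubleType)))

    // A field built from an ML attribute is DoubleType, which matches the reported behavior.
    val attrField = NominalAttribute.defaultAttr.withName("output").toStructField()
    assert(attrField.dataType == DoubleType)

    // Appending the column with an explicit type yields the expected StringType.
    val outSchema = appendColumn(inSchema, "output", StringType)
    assert(outSchema("output").dataType == StringType)
  }
}
```

With this shape, the output column type no longer depends on the attribute metadata, which is why the assertion from the reproducible example would hold.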

          People

            Assignee: Nick Pritchard (pnpritchard)
            Reporter: Nick Pritchard (pnpritchard)
            Shepherd: Xiangrui Meng
            Votes: 0
            Watchers: 2