  1. Spark
  2. SPARK-22228

Add support for Array<primitive_type> so that from_json can parse JSON arrays of primitives


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: Java API
    • Labels: None

    Description

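      // Assumes a SparkSession named `spark` (as in spark-shell); needed for toDF.
      import spark.implicits._

      val inputDS = Seq("""["foo", "bar"]""").toDF
      
      inputDS.printSchema()
      // root
      //  |-- value: string (nullable = true)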
      
      

      Input Dataset inputDS

      inputDS.show(false)
      
      value
      -----
      ["foo", "bar"]
      

      Expected output dataset outputDS

      value
      -------
      "foo" |
      "bar" |
      
      

      I tried the explode function as shown below, but it does not work:

      inputDS.select(explode(from_json(col("value"), ArrayType(StringType))))
      

      and got the following error:

      org.apache.spark.sql.AnalysisException: cannot resolve 'jsontostructs(`value`)' due to data type mismatch: Input schema string must be a struct or an array of structs
      

      I also tried the following:

      inputDS.select(explode(col("value")))
      

      and got the following error:

      org.apache.spark.sql.AnalysisException: cannot resolve 'explode(`value`)' due to data type mismatch: input to function explode should be array or map type, not StringType
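
A possible workaround for the two errors above, offered as a minimal sketch and not as the resolution of this ticket: since from_json in 2.2.0 accepts only a struct or an array of structs, the JSON array can be wrapped in a JSON object before parsing, and the resulting array field can then be exploded. The field name "items", the names wrapperSchema, wrapped, and outputDS, and the local SparkSession setup are assumptions made for illustration. The string concatenation also assumes every row holds a well-formed JSON array.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, explode, from_json, lit}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Hypothetical local session for illustration; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("SPARK-22228-sketch")
  .getOrCreate()
import spark.implicits._

val inputDS = Seq("""["foo", "bar"]""").toDF

// Schema for the wrapper object: a struct with one array-of-strings field.
// "items" is an arbitrary name chosen for this sketch.
val wrapperSchema = StructType(Seq(StructField("items", ArrayType(StringType))))

// Turn ["foo", "bar"] into {"items":["foo", "bar"]} so the top level is a struct.
val wrapped = concat(lit("""{"items":"""), col("value"), lit("}"))

val outputDS = inputDS
  .select(from_json(wrapped, wrapperSchema).as("parsed"))
  // explode now receives an array column instead of a string column.
  .select(explode(col("parsed.items")).as("value"))

outputDS.show(false)
```

The rows produced this way hold the parsed string values, so they should appear without the surrounding double quotes shown in the expected output above.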
      

            People

              Assignee: Unassigned
              Reporter: kant kodali