  1. Spark
  2. SPARK-27027

from_avro function does not deserialize the Avro record of a struct column type correctly


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 2.4.0, 3.0.0
    • Fix Version/s: None
    • Component/s: Spark Shell, SQL
    • Labels: None

    Description

      The from_avro function produces incorrect output for a struct column: every row is decoded to the values of the last record. See the output at the bottom of the description.

      import org.apache.spark.sql.types._
      import org.apache.spark.sql.avro._
      import org.apache.spark.sql.functions._

      spark.version

      val df = Seq((1, "John Doe", 30), (2, "Mary Jane", 25), (3, "Josh Duke", 50)).toDF("id", "name", "age")

      // Pack name and age into a single struct column named "value"
      val dfStruct = df.withColumn("value", struct("name", "age"))

      dfStruct.show
      dfStruct.printSchema

      // Serialize the key and the struct value to Avro binary
      val dfKV = dfStruct.select(to_avro('id).as("key"), to_avro('value).as("value"))

      // Avro schema for the struct value, derived from the expected Catalyst schema
      val expectedSchema = StructType(Seq(StructField("name", StringType, true), StructField("age", IntegerType, false)))

      val avroTypeStruct = SchemaConverters.toAvroType(expectedSchema).toString

      // Avro schema for the integer key
      val avroTypeStr = s"""
       |{
       | "type": "int",
       | "name": "key"
       |}
       """.stripMargin

      dfKV.select(from_avro('key, avroTypeStr)).show
      dfKV.select(from_avro('value, avroTypeStruct)).show
      
      // Output of the last statement, which is incorrect: every row shows the last record
      +---------------------------------------------+
      |from_avro(value, struct<name:string,age:int>)|
      +---------------------------------------------+
      |                              [Josh Duke, 50]|
      |                              [Josh Duke, 50]|
      |                              [Josh Duke, 50]|
      +---------------------------------------------+
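
The symptom can also be checked programmatically rather than by reading the `show` output. The following is a minimal sketch, not part of the original report: it round-trips the same struct through to_avro and from_avro, collects the result, and compares each decoded struct with the input on the driver. The names `session`, `people`, `input`, `decoded`, `expected` and `mismatches` are illustrative, and the sketch assumes the spark-avro module is on the classpath.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.avro._
import org.apache.spark.sql.functions.{col, struct}
import org.apache.spark.sql.types._

// In spark-shell this returns the existing session; standalone it starts a local one.
val session = SparkSession.builder().master("local[1]").appName("from_avro-check").getOrCreate()
import session.implicits._

// Same sample data as in the report.
val people = Seq((1, "John Doe", 30), (2, "Mary Jane", 25), (3, "Josh Duke", 50))
val input = people.toDF("id", "name", "age").withColumn("value", struct("name", "age"))

// Same struct schema as in the report, converted to an Avro schema string.
val structSchema = StructType(Seq(
  StructField("name", StringType, nullable = true),
  StructField("age", IntegerType, nullable = false)))
val avroSchema = SchemaConverters.toAvroType(structSchema).toString

// Encode and decode the struct, keeping the id so rows can be matched to the input.
val decoded: Array[Row] = input
  .select(col("id"), to_avro(col("value")).as("bytes"))
  .select(col("id"), from_avro(col("bytes"), avroSchema).as("decoded"))
  .collect()

// Compare on the driver: each id should map back to its own (name, age) pair.
val expected = people.map { case (id, name, age) => id -> Row(name, age) }.toMap
val mismatches = decoded.filter(r => expected(r.getInt(0)) != r.getStruct(1))

mismatches.foreach(r => println(s"id=${r.getInt(0)} decoded as ${r.getStruct(1)}"))
println(s"${mismatches.length} of ${decoded.length} rows differ from the input")
```

A correct round trip should report zero differing rows. Because the projection here also carries the `id` column, the query plan is not identical to the one in the report, so this check may not reproduce the problem on every affected build.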
      

            People

              Assignee: Unassigned
              Reporter: Hien Luu (hluu)
              Votes: 0
              Watchers: 3
