SPARK-48718

ClassCastException when the deserializer in cogroup is resolved during application of the DeduplicateRelations rule


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL
    • Labels: None

    Description

      When running the following commands:

       

       

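          // Assumes `spark` is an active SparkSession (e.g. in spark-shell).
          import org.apache.spark.sql.Row
          import org.apache.spark.sql.functions.col
          import org.apache.spark.sql.types.{LongType, StructField, StructType}
          // Scala 2.13; on Scala 2.12 use scala.collection.JavaConverters._ instead.
          import scala.jdk.CollectionConverters._
          import spark.implicits._ // encoders for the Long keys and results

          val lhs = spark.createDataFrame(
            List(Row(123L)).asJava,
            StructType(Seq(StructField("GROUPING_KEY", LongType)))
          )
          val rhs = spark.createDataFrame(
            List(Row(0L, 123L)).asJava,
            StructType(Seq(StructField("ID", LongType), StructField("GROUPING_KEY", LongType)))
          )
          val lhsKV = lhs.groupByKey((r: Row) => r.getAs[Long]("GROUPING_KEY"))
          val rhsKV = rhs.groupByKey((r: Row) => r.getAs[Long]("GROUPING_KEY"))
          val cogrouped = lhsKV.cogroup(rhsKV)(
            (a: Long, b: Iterator[Row], c: Iterator[Row]) => Iterator(0L)
          )
          // Joining `rhs` with a plan that already contains `rhs` triggers the failure.
          val joined = rhs.join(cogrouped, col("ID") === col("value"), "left")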

       

       

      It fails with the following error:

      java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.objects.AssertNotNull cannot be cast to org.apache.spark.sql.catalyst.analysis.UnresolvedDeserializer 
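
      To check whether a particular Spark build is affected, the same steps can be wrapped in a small standalone program. The following is only a sketch: the object name Spark48718Check, the local[1] master, and the use of scala.util.Try are illustrative assumptions and are not part of the original report. It assumes Scala 2.13 and Spark on the classpath.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{LongType, StructField, StructType}

import scala.jdk.CollectionConverters._
import scala.util.{Failure, Success, Try}

// Hypothetical helper object; the name is not from the report.
object Spark48718Check {
  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough to hit the analyzer path.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-48718-check")
      .getOrCreate()
    import spark.implicits._ // encoders for the Long keys and results

    val lhs = spark.createDataFrame(
      List(Row(123L)).asJava,
      StructType(Seq(StructField("GROUPING_KEY", LongType)))
    )
    val rhs = spark.createDataFrame(
      List(Row(0L, 123L)).asJava,
      StructType(Seq(StructField("ID", LongType), StructField("GROUPING_KEY", LongType)))
    )

    val cogrouped = lhs
      .groupByKey((r: Row) => r.getAs[Long]("GROUPING_KEY"))
      .cogroup(rhs.groupByKey((r: Row) => r.getAs[Long]("GROUPING_KEY")))(
        (_: Long, _: Iterator[Row], _: Iterator[Row]) => Iterator(0L)
      )

    // Build and run the join inside Try so the outcome can be reported either way.
    val outcome = Try(rhs.join(cogrouped, col("ID") === col("value"), "left").collect())

    try {
      outcome match {
        case Success(rows) =>
          println(s"Join succeeded with ${rows.length} row(s); this build looks unaffected.")
        case Failure(e: ClassCastException) =>
          println(s"Reproduced the reported failure: ${e.getMessage}")
        case Failure(other) =>
          throw other // any other failure is unrelated to this issue
      }
    } finally {
      spark.stop()
    }
  }
}
```

      The join is wrapped together with collect() so that the check does not depend on whether the failure surfaces when the plan is analyzed or only when it is executed.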

          People

            Assignee: Xinyi Yu (xyyu)
            Reporter: Xinyi Yu (xyyu)