Spark / SPARK-17922

ClassCastException java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator cannot be cast to org.apache.spark.sql.catalyst.expressions.UnsafeProjection

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0.0, 2.0.1
    • Fix Version/s: None
    • Component/s: SQL

    Description

      I am using Spark 2.0.
      I am seeing a class loading issue because whole-stage code generation produces multiple classes with the same name, "org.apache.spark.sql.catalyst.expressions.GeneratedClass".
      I am using a DataFrame transform, and within the transform I use OSGi.
      OSGi replaces the thread context class loader with ContextFinder, which walks all the class loaders on the stack to resolve the newly generated class. Instead of falling back to the byte class loader created by the Janino compiler, it finds the byte class loader that holds GeneratedClass with the inner class GeneratedIterator. Because the class name is the same, that byte class loader loads the class and returns GeneratedClass$GeneratedIterator instead of the expected GeneratedClass$UnsafeProjection.
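
      Since the problem described above depends on which thread context class loader is installed at the time the generated class is resolved, one thing that could be tried is to save the context class loader and restore it once the OSGi work is done. The sketch below is only an illustration of that idea: `withContextClassLoader` is a hypothetical helper name, not a Spark or OSGi API, and whether this avoids the ClassCastException has not been verified in this report.

```scala
// Hypothetical helper (not part of Spark or OSGi): runs `body` with `cl`
// installed as the thread context class loader, then restores the previous
// loader even if `body` throws.
def withContextClassLoader[T](cl: ClassLoader)(body: => T): T = {
  val thread = Thread.currentThread()
  val saved = thread.getContextClassLoader
  thread.setContextClassLoader(cl)
  try body
  finally thread.setContextClassLoader(saved)
}

// Diagnostic: report which loader the current thread would use for lookups.
def describeContextClassLoader(): String =
  String.valueOf(Thread.currentThread().getContextClassLoader)
```

      Inside the partition function, the OSGi initialization could be wrapped in `withContextClassLoader(someLoader) { ... }` so that the loader Spark installed is back in place before any rows are processed; `someLoader` stands for whatever loader the OSGi framework needs and is an assumption here.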

      Can the code generator emit classes with distinct names, or is it expected that only one class name is generated?
      This is roughly what I am trying to do:

       
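      // Assumes a spark-shell session, where `spark` and the `$` column syntax are in scope
      import org.apache.spark.sql._
      import org.apache.spark.sql.functions.lit
      import org.apache.spark.sql.types._
      import com.databricks.spark.avro._

      // Per-partition function; `out` is the output schema (unused in this reduced example)
      def exePart(out: StructType): Iterator[Row] => Iterator[Row] = {
        // Initialize OSGi here
        (rows: Iterator[Row]) => {
          var outi = Iterator[Row]()
          while (rows.hasNext) {
            val r = rows.next()
            // Copy the first column of each input row into a new single-column row
            outi = outi ++ Iterator(Row(r.get(0)))
          }
          // val ors = Row("abc")
          // outi = outi ++ Iterator(ors)
          outi
        }
      }

      // Wraps exePart as a DataFrame => DataFrame function for use with transform
      def transform1(outType: StructType): DataFrame => DataFrame = {
        (d: DataFrame) => {
          val rdd = d.rdd.mapPartitions(exePart(outType))
          d.sqlContext.createDataFrame(rdd, outType)
        }
      }

      val df = spark.read.avro("file:///data/builds/a1.avro")
      // lit(false) makes the filter an explicit Column predicate that matches no rows
      val df1 = df.select($"id2").filter(lit(false))
      // createOrReplaceTempView returns Unit, so its result is not assigned
      df1.transform(transform1(StructType(StructField("p1", IntegerType, true) :: Nil)))
        .createOrReplaceTempView("tbl0")

      spark.sql("insert overwrite table testtable select p1 from tbl0")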
      

      Attachments

        1. spark_17922.tar.gz
          3.89 MB
          kanika dhuria

          People

            Assignee: Unassigned
            Reporter: kanika dhuria (kdhuria)
            Votes: 1
            Watchers: 7
