SPARK-14520: ClassCastException thrown with spark.sql.parquet.enableVectorizedReader=true

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: SQL
    • Labels: None

    Description

      Build details: Spark build from master branch (Apr-10)

      Data: TPC-DS at 200 GB scale, stored in Parquet format in Hive.

      Ran TPC-DS Query 27 via the Spark beeline client with "spark.sql.sources.fileScan=false". The query failed with the following exception:

       java.lang.ClassCastException: org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader cannot be cast to org.apache.parquet.hadoop.ParquetRecordReader
              at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetInputFormat.createRecordReader(ParquetRelation.scala:480)
              at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetInputFormat.createRecordReader(ParquetRelation.scala:476)
              at org.apache.spark.rdd.SqlNewHadoopRDD$$anon$1.<init>(SqlNewHadoopRDD.scala:161)
              at org.apache.spark.rdd.SqlNewHadoopRDD.compute(SqlNewHadoopRDD.scala:121)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
              at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
              at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
              at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:69)
              at org.apache.spark.scheduler.Task.run(Task.scala:82)
              at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:231)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
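
      The top frame of the trace reports an unconditional downcast failing: the reader that was created is a VectorizedParquetRecordReader, while the calling code expected a ParquetRecordReader. The following is a minimal sketch of that failure mode only. The classes BaseReader, RowReader and VectorizedReader are hypothetical stand-ins, and the assumption that the two real reader classes are siblings under a common base type is inferred from the exception message rather than taken from the Spark source.

```java
// Hypothetical stand-ins; these are NOT Spark's or Parquet's actual classes.
public class ReaderCastSketch {

    /** Common base type shared by both readers (assumed for illustration). */
    static abstract class BaseReader {
        abstract String describe();
    }

    /** Stand-in for the row-based reader that the calling code expects. */
    static class RowReader extends BaseReader {
        String describe() { return "row-based reader"; }
    }

    /** Stand-in for the vectorized reader that is actually created. */
    static class VectorizedReader extends BaseReader {
        String describe() { return "vectorized reader"; }
    }

    /** Unconditional downcast: throws ClassCastException for any sibling of RowReader. */
    static RowReader unsafeCast(BaseReader reader) {
        return (RowReader) reader;
    }

    /** Returns true if the unconditional downcast throws ClassCastException. */
    static boolean castFails(BaseReader reader) {
        try {
            unsafeCast(reader);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFails(new RowReader()));        // prints false
        System.out.println(castFails(new VectorizedReader())); // prints true
    }
}
```

      In a sketch like this, the cast compiles because both classes share a base type, and the mismatch only surfaces at runtime, which is consistent with the exception appearing inside createRecordReader during task execution.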
      

      Creating this JIRA as a placeholder to track this issue.

          People

            Assignee: L. C. Hsieh (viirya)
            Reporter: Rajesh Balamohan (rajesh.balamohan)