SPARK-24676

Project required data from parsed data when csvColumnPruning disabled

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3.1
    • Fix Version/s: 2.4.0
    • Component/s: SQL
    • Labels: None

    Description

      I hit the bug below when parsing CSV data:

      ./bin/spark-shell --conf spark.sql.csv.parser.columnPruning.enabled=false
      scala> val dir = "/tmp/spark-csv/csv"
      scala> spark.range(10).selectExpr("id % 2 AS p", "id").write.mode("overwrite").partitionBy("p").csv(dir)
      scala> spark.read.csv(dir).selectExpr("sum(p)").collect()
      18/06/25 13:48:46 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 7)
      java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
              at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
              at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getInt(rows.scala:41)
              ...
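
One plausible reading of the issue title and the stack trace is that, with column pruning disabled, the parser produces a row holding every column of the data schema, so the required columns still have to be projected out of that row before partition values are appended. The sketch below illustrates that projection step in plain Scala. It is an interpretation under that assumption, not the actual patch: `ProjectRequiredColumns`, `parseAll`, and `project` are hypothetical names and none of them are Spark APIs.

```scala
// Hypothetical, Spark-free illustration; none of these names are Spark APIs.
object ProjectRequiredColumns {
  // Parse every field of a CSV line, as a parser with column pruning
  // disabled would (naive split, no quoting or escaping support).
  def parseAll(line: String): Array[String] = line.split(",", -1)

  // Keep only the required columns, in the order the reader asked for them.
  def project(
      parsed: Array[String],
      dataSchema: Seq[String],
      requiredSchema: Seq[String]): Array[String] = {
    val indexOf = dataSchema.zipWithIndex.toMap
    requiredSchema.map(name => parsed(indexOf(name))).toArray
  }

  def main(args: Array[String]): Unit = {
    val dataSchema = Seq("a", "b", "c")
    val parsed = parseAll("1,x,2.5")
    // Project two of the three parsed columns, reordered.
    println(project(parsed, dataSchema, Seq("c", "a")).mkString(","))
    // With no data columns required, the projected row must be empty.
    println(project(parsed, dataSchema, Seq.empty).length)
  }
}
```

In the reproduction above, `sum(p)` reads only the partition column, so no data columns are required. If the full parsed row were passed through without such a projection, a string field could end up in the position where the integer partition value is expected, which would be consistent with the `ClassCastException` shown.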
      

          People

            Assignee: maropu Takeshi Yamamuro
            Reporter: maropu Takeshi Yamamuro