Spark / SPARK-23230

When hive.default.fileformat is set to another file format, creating a textfile table causes a SerDe error


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1
    • Fix Version/s: 2.2.2, 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

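      When hive.default.fileformat is set to a file format other than textfile (for example orc), creating a table with "stored as textfile" results in a SerDe error, because the table picks up the SerDe of the default file format.
      Both textfile and sequencefile should use org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe as their default SerDe.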

      set hive.default.fileformat=orc;
      create table tbl( i string ) stored as textfile;
      desc formatted tbl;
      
      Serde Library org.apache.hadoop.hive.ql.io.orc.OrcSerde
      InputFormat  org.apache.hadoop.mapred.TextInputFormat
      OutputFormat  org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

       

      set hive.default.fileformat=orc;
      create table tbl stored as textfile
      as
      select 1;

      The CTAS statement fails because the table is created with the wrong SerDe (OrcSerde instead of LazySimpleSerDe):

      Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.io.orc.OrcSerde$OrcSerdeRow cannot be cast to org.apache.hadoop.io.BytesWritable
      	at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat$1.write(HiveIgnoreKeyTextOutputFormat.java:91)
      	at org.apache.spark.sql.hive.execution.HiveOutputWriter.write(HiveFileFormat.scala:149)
      	at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:327)
      	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
      	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
      	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
      	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
      	... 16 more
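
      The mismatch described above can be modelled with a small, self-contained sketch. It is not Spark's code: the FORMATS table and the functions resolve_buggy and resolve_fixed are hypothetical names introduced only for illustration, and the fallback logic is an assumption about how a missing SerDe for textfile could end up inheriting the SerDe of hive.default.fileformat. The class names are the ones shown in the output above, plus the standard Hadoop/Hive classes for sequencefile and orc.

```python
from typing import NamedTuple, Optional

LAZY_SIMPLE_SERDE = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
ORC_SERDE = "org.apache.hadoop.hive.ql.io.orc.OrcSerde"


class Storage(NamedTuple):
    input_format: str
    output_format: str
    serde: Optional[str]  # None models "this format does not name a SerDe"


# Illustrative lookup table (hypothetical, not copied from Spark).
FORMATS = {
    "textfile": Storage(
        "org.apache.hadoop.mapred.TextInputFormat",
        "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
        None,
    ),
    "sequencefile": Storage(
        "org.apache.hadoop.mapred.SequenceFileInputFormat",
        "org.apache.hadoop.mapred.SequenceFileOutputFormat",
        None,
    ),
    "orc": Storage(
        "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
        "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat",
        ORC_SERDE,
    ),
}


def resolve_buggy(stored_as: str, default_format: str) -> Storage:
    """Assumed faulty behaviour: a missing SerDe is filled in from the
    default file format, even though STORED AS named a different format."""
    chosen = FORMATS[stored_as]
    default = FORMATS[default_format]
    serde = chosen.serde or default.serde or LAZY_SIMPLE_SERDE
    return chosen._replace(serde=serde)


def resolve_fixed(stored_as: str, default_format: str) -> Storage:
    """Proposed behaviour: textfile and sequencefile default to
    LazySimpleSerDe regardless of hive.default.fileformat."""
    chosen = FORMATS[stored_as]
    serde = chosen.serde or LAZY_SIMPLE_SERDE
    return chosen._replace(serde=serde)


if __name__ == "__main__":
    for resolve in (resolve_buggy, resolve_fixed):
        storage = resolve("textfile", "orc")
        print(resolve.__name__, storage.serde, storage.output_format)
```

      In this model, resolve_buggy("textfile", "orc") pairs OrcSerde with HiveIgnoreKeyTextOutputFormat, which is the combination reported by desc formatted, while resolve_fixed keeps the text formats together with LazySimpleSerDe.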
      

       


          People

            Assignee: dzcxzl
            Reporter: dzcxzl