HIVE-21861

ClassCastException during CTAS over external table using KafkaStorageHandler

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: kafka integration
    • Labels: None

    Description

      To reproduce, create a table similar to the following:

      CREATE EXTERNAL TABLE <table>
        (raw_value STRING)
      ROW FORMAT DELIMITED
      LINES TERMINATED BY '\n'
      STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
      TBLPROPERTIES (
        "kafka.topic"="<kafka_topic>",
        "kafka.bootstrap.servers"="<bootstrap_servers>",
        "kafka.consumer.security.protocol"="PLAINTEXT",
        "kafka.serde.class"="org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe");
      

      Note that kafka.serde.class is set to LazySimpleSerDe, which is not the storage handler's default SerDe. The error also requires vectorization to be enabled.

      Basic queries work fine:

      SELECT * FROM <table> LIMIT 1;
      

      Doing a CTAS to bring it into a managed table fails:

      CREATE TABLE <managed_table> AS
      SELECT * FROM <table>;
      

      The exception is:

      Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyString cannot be cast to org.apache.hadoop.io.Text
        at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:471)
        at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
        at org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader.readNextBatch(VectorizedKafkaRecordReader.java:159)
        at org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader.next(VectorizedKafkaRecordReader.java:113)
        at org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader.next(VectorizedKafkaRecordReader.java:47)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
        ... 24 more
      

      A workaround is to disable vectorization:

      set hive.vectorized.execution.enabled = false;
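
      Because SET applies to the current session, the workaround can be limited to the failing statement. The following is a minimal sketch, assuming vectorization was enabled before the session started (the final statement restoring it to true rests on that assumption); <table> and <managed_table> are the same placeholders used above:

      ```sql
      -- Print the current value so it can be restored afterwards.
      SET hive.vectorized.execution.enabled;

      -- Disable vectorization for this session so the CTAS avoids the vectorized Kafka reader.
      SET hive.vectorized.execution.enabled=false;

      CREATE TABLE <managed_table> AS
      SELECT * FROM <table>;

      -- Restore the previous setting (assumed here to have been true).
      SET hive.vectorized.execution.enabled=true;
      ```

      Scoping the change this way leaves vectorization on for other queries in the session, which only matters if they benefit from it.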
      

      Attachments

        1. HIVE-21861.patch
          2 kB
          Rajkumar Singh

          People

            Assignee: Rajkumar Singh
            Reporter: Justin Leet (justinleet)
            Votes: 0
            Watchers: 4
