Uploaded image for project: 'Apache Storm'
  1. Apache Storm
  2. STORM-2062

Hive streaming doesn't support non string partition fields

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.0.1
    • None
    • storm-hive
    • None

    Description

      create hive table with an int partition column

      CREATE TABLE CDRDWH.CDR_FACT (
      geo_id int,
      time_id smallint,
      cust_id smallint,
      vend_id smallint,
      cust_rel_id smallint,
      vend_rel_id smallint,
      route tinyint,
      connect boolean,
      earlyEvent boolean,
      Call_duration_cust double,
      I_PDD double,
      E_PDD double,
      orig_number string,
      term_number string
      )
      partitioned by (date_id int)
      clustered by (geo_id, time_id) into 16 buckets
      stored as ORC
      tblproperties ("orc.compress"="SNAPPYā€¯);

      When i try to stream my topolgy output to Hive I get the following exception:

      11829 [Thread-31-hivewriter-executor[5 5]] ERROR o.a.s.d.executor -
      java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
      at org.apache.storm.tuple.TupleImpl.getStringByField(TupleImpl.java:153) ~[storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper.mapPartitions(DelimitedRecordHiveMapper.java:92) ~[storm-hive-1.0.1.jar:1.0.1]
      at org.apache.storm.hive.bolt.HiveBolt.execute(HiveBolt.java:112) [storm-hive-1.0.1.jar:1.0.1]
      at org.apache.storm.daemon.executor$fn_7953$tuple_action_fn_7955.invoke(executor.clj:728) [storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.daemon.executor$mk_task_receiver$fn__7874.invoke(executor.clj:461) [storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.disruptor$clojure_handler$reify__7390.onEvent(disruptor.clj:40) [storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:439) [storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:418) [storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.daemon.executor$fn_7953$fn7966$fn_8019.invoke(executor.clj:847) [storm-core-1.0.1.jar:1.0.1]
      at org.apache.storm.util$async_loop$fn__625.invoke(util.clj:484) [storm-core-1.0.1.jar:1.0.1]
      at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
      at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]

      Line 92 of DelimtedRecordHiveMapper is attempting to access my integer field as a String and the subsequent exception is thrown
      @Override
      public List<String> mapPartitions(Tuple tuple) {
      List<String> partitionList = new ArrayList<String>();
      if(this.partitionFields != null) {
      for(String field: this.partitionFields)

      { partitionList.add(tuple.getStringByField(field)); }

      }
      if (this.timeFormat != null)

      { partitionList.add(getPartitionsByTimeFormat()); }

      return partitionList;
      }

      Attachments

        Activity

          People

            Unassigned Unassigned
            ryan_templeton Ryan Templeton
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: