Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-1948

java.lang.ClassCastException while using double value from result of a group

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Invalid
    • 0.7.0, 0.8.0, 0.9.0
    • 0.8.0
    • impl
    • None

    Description

      I have a fairly simple script (but too many coloumns) which is failing with class cast exception.

      register myudf.jar;
      A = load 'newinput' as (datestamp: chararray,vtestid: chararray,src_kt1: chararray,f1: chararray,f2: chararray,f3: chararray,f4: chararray,f5: chararray,f6: int,ipc: chararray,woeid: long,woeid_place: chararray,f7: chararray,f8: double,woeid_latitude: double,f9: chararray,woeid_town: chararray,woeid_county: chararray,a1: chararray,a2: chararray,woeid_country: chararray,a3: chararray,connection_speed: chararray,isp_name: chararray,isp_domain: chararray,ecnt: int,vcnt: int,ccnt: int,startts: int,duration: int,endts: int,stqust: chararray,startqc: chararray,starts_con: chararray,starts_lng: chararray,startv_pk1: int,startv_pk2: int,startv_pk3: int,startv_pk4: int,startv_pk5: int,lastquerystring: chararray,lastqc: chararray,lasts_con: chararray,lasts_lng: chararray,lastv_pk1: int,lastv_pk2: int,lastv_pk3: int,lastv_pk4: int,lastv_pk5: int,b1: chararray,lastsection: chararray,lastseclink: chararray,lasturl: chararray,path: chararray,pathtype: chararray,firstlastquerymatch: int,log_duration: double,log_duration_sq: double,duration_sq: double);
      
      B = foreach A generate  datestamp,src_kt1,vtestid,stqust,ecnt,vcnt,ccnt,log_duration,duration;
      C = group B by ( datestamp, src_kt1,vtestid, stqust ) parallel 4;
      D = foreach C generate COUNT( B ) as total, MyEval( B.log_duration ) as log_duration_summary;
      store D into 'output';
      
      

      The above script is failing with class cast exception;

      java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.String
      	at org.apache.pig.data.BinInterSedes.readMap(BinInterSedes.java:193)
      	at org.apache.pig.data.BinInterSedes.readDatum(BinInterSedes.java:280)
      	at org.apache.pig.data.BinInterSedes.readDatum(BinInterSedes.java:251)
      	at org.apache.pig.data.BinInterSedes.readTuple(BinInterSedes.java:111)
      	at org.apache.pig.data.BinInterSedes.readDatum(BinInterSedes.java:270)
      	at org.apache.pig.data.BinInterSedes.readDatum(BinInterSedes.java:251)
      	at org.apache.pig.data.BinInterSedes.addColsToTuple(BinInterSedes.java:555)
      	at org.apache.pig.data.BinSedesTuple.readFields(BinSedesTuple.java:64)
      	at org.apache.pig.impl.io.PigNullableWritable.readFields(PigNullableWritable.java:114)
      	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
      	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
      	at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
      	at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
      	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
      	at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1376)
              .
              .
      

      The problem is happening in the line MyEval( B.log_duration ), here even though log_duration is defined as a double field BinInterSedes is considering it as a map value, TINYMAP to be exact. Hence it is trying to cast the double value into the key identifier, ie a String . This bug exists in 0.9 also.

      Attachments

        Activity

          People

            thejas Thejas Nair
            vivekp Vivek Padmanabhan
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: