Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 0.12.0, 0.11.1
- None
- Reviewed
- UDFs now support LoadCaster. By default, Pig checks whether all parameters of the UDF belong to the same LoadCaster and, if they do, uses that LoadCaster.
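The default rule in the release note above can be pictured with a small, Pig-free sketch. This is only an illustration of the "all parameters share one caster" check, not Pig's actual implementation: the class `CasterSelection`, the method `commonCaster`, and the idea of identifying a caster by a plain string name are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: each UDF argument is tagged with the name of the
// load-side caster that produced it (null if no caster is known).
final class CasterSelection {

    // Returns the shared caster name when every argument has the same
    // non-null caster; returns null when no single caster can be chosen.
    static String commonCaster(List<String> argumentCasters) {
        if (argumentCasters.isEmpty()) {
            return null;
        }
        String first = argumentCasters.get(0);
        if (first == null) {
            return null;
        }
        for (String caster : argumentCasters) {
            if (!first.equals(caster)) {
                return null; // mixed or unknown casters: nothing to reuse
            }
        }
        return first;
    }

    public static void main(String[] args) {
        System.out.println(commonCaster(Arrays.asList("PigStorage", "PigStorage")));
        System.out.println(commonCaster(Arrays.asList("PigStorage", "OtherLoader")));
    }
}
```

In this sketch a `null` result stands for the case where the arguments come from different sources, so no caster can be inferred for the UDF output.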
Description
This ticket is closely related to http://stackoverflow.com/questions/8828839/how-can-correct-data-types-on-apache-pig-be-enforced.
To reproduce the issue, start with a UDF that converts a map to a bag. The code is almost the same as the one at http://stackoverflow.com/questions/12476929/group-key-value-of-map-in-pig?answertab=votes#tab-top.
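The conversion such a UDF performs can be sketched without any Pig dependency. The code below is not the UDF from the linked answer. It is a plain-Java stand-in in which a bag is a list and each tuple is a two-element list, and the names `MapToBagSketch` and `mapToBag` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a map-to-bag conversion: every map entry becomes
// one (key, value) tuple, and the tuples are collected into a bag.
final class MapToBagSketch {

    static List<List<Object>> mapToBag(Map<String, Object> map) {
        List<List<Object>> bag = new ArrayList<>();
        if (map == null) {
            return bag; // treat a missing map as an empty bag
        }
        for (Map.Entry<String, Object> entry : map.entrySet()) {
            List<Object> tuple = new ArrayList<>(2);
            tuple.add(entry.getKey());
            tuple.add(entry.getValue()); // the value keeps whatever type it had
            bag.add(tuple);
        }
        return bag;
    }

    public static void main(String[] args) {
        Map<String, Object> kv = new LinkedHashMap<>();
        kv.put("x", "1");
        kv.put("y", "ab");
        System.out.println(mapToBag(kv));
    }
}
```

Note that the values pass through untouched: the conversion changes the container, not the type of each value.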
test.pig
$ cat test.pig
register polisan/maptobag.jar;
define MAPTOBAG maptobag.MAPTOBAG();
A = load 'polisan/input1.txt' using PigStorage(' ') as (id:chararray, kv:[]);
B = foreach A generate id, MAPTOBAG(kv) as to_bag;
C = foreach B generate id, flatten(to_bag) as (key:chararray, value:chararray);
D = group C by (id, key);
E = foreach D generate group, MIN(C.value);
dump E;
polisan/input1.txt
1 [x#1,y#ab]
1 [x#2,y#cd]
Running the script then fails with the following exception:
2014-05-15 19:44:52,944 [Thread-2] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: D: Local Rearrange[tuple]{tuple}(false) - scope-42 Operator Key: scope-42): org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing min in Initial
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:289)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:263)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing min in Initial
    at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:81)
    at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:1)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:352)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:391)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:281)
    ... 8 more
Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.String
    at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:73)
    ... 15 more
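The root cause in the trace is a string aggregate receiving a DataByteArray where it expects a java.lang.String. The same kind of failure can be reproduced outside Pig with a rough sketch. Everything here is a stand-in: `ByteArrayValue` is a hypothetical, simplified substitute for org.apache.pig.data.DataByteArray, and `CastDemo` only imitates the unchecked cast; it is not the code of StringMin and does not show how the ticket was fixed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a byte-array value type: it wraps raw bytes
// and is not a String, so a direct cast to String must fail.
final class ByteArrayValue {
    private final byte[] bytes;

    ByteArrayValue(String text) {
        this.bytes = text.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

final class CastDemo {

    // Imitates an aggregate that trusts the declared type and casts blindly.
    static String minUnchecked(List<Object> values) {
        String min = null;
        for (Object value : values) {
            String s = (String) value; // ClassCastException for ByteArrayValue
            if (min == null || s.compareTo(min) < 0) {
                min = s;
            }
        }
        return min;
    }

    // Converts each value to a String explicitly before comparing.
    static String minConverted(List<Object> values) {
        String min = null;
        for (Object value : values) {
            String s = String.valueOf(value);
            if (min == null || s.compareTo(min) < 0) {
                min = s;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        List<Object> values = Arrays.<Object>asList(new ByteArrayValue("2"), new ByteArrayValue("1"));
        System.out.println(minConverted(values));
        try {
            minUnchecked(values);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

The sketch makes the mismatch visible: declaring a field as a string does not change the runtime type of the object that actually arrives, so the value has to be converted somewhere before a string-only function sees it.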
Attachments
Issue Links
- is related to
- PIG-4974 A simple map reference fail to cast
- Resolved