Description
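A Pig script that groups by 3 columns and then flattens the group key with a 4-column alias list runs successfully. In principle it should not: the number of aliases (4) does not match the number of fields in the group key (3), so the script should arguably fail with a front-end error.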
A = load 'groupcardinalitycheck.txt' using PigStorage() as (col1:chararray, col2:chararray, col3:int, col4:chararray);
B = group A by (col1, col2, col3);
C = foreach B generate flatten(group) as (col1, col2, col3, col4), SIZE(A) as frequency;
dump C;
==========================================================================================
Data
==========================================================================================
hello CC 1 there
hello YSO 2 out
ouch CC 2 hey
==========================================================================================
Result of the preceding script
==========================================================================================
(ouch,CC,2,1L)
(hello,CC,1,1L)
(hello,YSO,2,1L)
==========================================================================================
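For comparison, here is a sketch of the same script with an alias list that matches the three fields of the group key. It reuses the file name and column names from the script above and changes only the `as` clause on `flatten(group)`; it shows the form the front end would be expected to accept, not a workaround confirmed by the Pig developers.

```pig
-- Load the same four-column input as in the report.
A = load 'groupcardinalitycheck.txt' using PigStorage() as (col1:chararray, col2:chararray, col3:int, col4:chararray);

-- The group key is a tuple of exactly three fields.
B = group A by (col1, col2, col3);

-- The alias list now names three fields, matching the group key.
C = foreach B generate flatten(group) as (col1, col2, col3), SIZE(A) as frequency;

dump C;
```

With this form, each of the three flattened fields has exactly one alias, so nothing is left for a cardinality check to reject.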