Pig 0.11+ includes the following UDFs for operating on Maps:
1. VALUESET
2. VALUELIST
3. KEYSET
4. INVERSEMAP
VALUESET
This UDF takes a Map and returns a Bag containing the set of values in the map.
Note that this UDF returns only unique values. For all values, including
duplicates, use VALUELIST instead.
<code>
grunt> cat data
[open#apache,1#2,11#2]
[apache#hadoop,3#4,12#hadoop]
grunt> a = load 'data' as (M:[]);
grunt> b = foreach a generate VALUESET($0);
grunt> dump b;
({(apache),(2)})
({(4),(hadoop)})
</code>
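The deduplication described above can be illustrated outside Pig. The following is a minimal sketch using plain java.util collections, not Pig's actual implementation; the class name ValueSetSketch and the method valueSet are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not the Pig VALUESET source.
class ValueSetSketch {
    // Collect the values of the map, keeping each distinct value once.
    static Set<Object> valueSet(Map<String, Object> map) {
        return new LinkedHashSet<>(map.values());
    }

    public static void main(String[] args) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("open", "apache");
        m.put("1", "2");
        m.put("11", "2");
        System.out.println(valueSet(m)); // prints [apache, 2]
    }
}
```

The duplicate value 2 appears once in the result, which mirrors the first output row of the grunt session.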
VALUELIST
This UDF takes a Map and returns a Bag containing the values from the map.
Note that the output bag contains all values, not just unique ones.
To obtain only the unique values from a map, use VALUESET instead.
<code>
grunt> cat data
[open#apache,1#2,11#2]
[apache#hadoop,3#4,12#hadoop]
grunt> a = load 'data' as (M:[]);
grunt> b = foreach a generate VALUELIST($0);
grunt> dump b;
({(apache),(2),(2)})
({(4),(hadoop),(hadoop)})
</code>
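For comparison with VALUESET, here is a minimal sketch of the "all values, duplicates included" behaviour using plain java.util collections. It is an illustration under assumptions, not Pig's implementation; ValueListSketch and valueList are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the Pig VALUELIST source.
class ValueListSketch {
    // Copy every value of the map into a list; duplicates are preserved.
    static List<Object> valueList(Map<String, Object> map) {
        return new ArrayList<>(map.values());
    }

    public static void main(String[] args) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("open", "apache");
        m.put("1", "2");
        m.put("11", "2");
        System.out.println(valueList(m)); // prints [apache, 2, 2]
    }
}
```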
KEYSET
This UDF takes a Map and returns a Bag containing the keys of the map.
<code>
grunt> cat data
[open#apache,1#2,11#2]
[apache#hadoop,3#4,12#hadoop]
grunt> a = load 'data' as (M:[]);
grunt> b = foreach a generate KEYSET($0);
grunt> dump b;
({(open),(1),(11)})
({(3),(apache),(12)})
</code>
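A rough Java analogue of the key extraction, given as a sketch rather than Pig's implementation (KeySetSketch and keys are hypothetical names). Pig map keys are chararrays, so the keys are modelled here as String.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not the Pig KEYSET source.
class KeySetSketch {
    // Return a copy of the map's keys; map keys are unique by definition.
    static Set<String> keys(Map<String, Object> map) {
        return new LinkedHashSet<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("open", "apache");
        m.put("1", "2");
        m.put("11", "2");
        System.out.println(keys(m)); // prints [open, 1, 11]
    }
}
```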
INVERSEMAP
This UDF accepts a Map whose values are of any primitive data type.
It swaps keys with values and returns the new inverse Map.
Because the original values may be non-unique, each String key of the
resulting Map points to a DataBag. The bag is composed of the original
keys that had that value.
Note:
1. UDF accepts Map with Values of primitive data type
2. UDF returns Map<String,DataBag>
<code>
grunt> cat data
[open#apache,1#2,11#2]
[apache#hadoop,3#4,12#hadoop]
grunt> a = load 'data' as (M:[]);
grunt> b = foreach a generate INVERSEMAP($0);
grunt> dump b;
([2#{(1),(11)},apache#{(open)}])
([hadoop#{(apache),(12)},4#{(3)}])
</code>
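The inversion rule (swap keys with values, grouping original keys that share a value) can be sketched in plain Java. This is an illustration under assumptions, not Pig's INVERSEMAP source: the names InverseMapSketch and inverse are hypothetical, and a List<String> stands in for the DataBag of original keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; a List stands in for Pig's DataBag.
class InverseMapSketch {
    // Swap keys and values; original keys sharing a value are grouped together.
    static Map<String, List<String>> inverse(Map<String, Object> map) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : map.entrySet()) {
            // The new key is the string form of the original value.
            String newKey = String.valueOf(e.getValue());
            out.computeIfAbsent(newKey, k -> new ArrayList<>()).add(e.getKey());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("open", "apache");
        m.put("1", "2");
        m.put("11", "2");
        System.out.println(inverse(m)); // prints {apache=[open], 2=[1, 11]}
    }
}
```

Every value in the result is a collection, even when only one original key had that value, which matches the Map<String,DataBag> return type noted above.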