Description
I have many queries that use the GROUPING() function. They fail on Hive only because Hive does not support GROUPING(). See the query below:
SELECT fact_1_id,
       fact_2_id,
       GROUPING(fact_1_id) AS f1g,
       GROUPING(fact_2_id) AS f2g
FROM dimension_tab
GROUP BY CUBE (fact_1_id, fact_2_id)
ORDER BY fact_1_id, fact_2_id;
To run on Hive, every such query has to be transformed to Hive syntax. The transformed, Hive-compatible query is shown below. The equivalent of each GROUPING() call is derived with a CASE expression on GROUPING__ID.
SELECT fact_1_id,
       fact_2_id,
       (case when (GROUPING__ID & 1) = 0 then 1 else 0 end) as f1g,
       (case when (GROUPING__ID & 2) = 0 then 1 else 0 end) as f2g
FROM dimension_tab
GROUP BY fact_1_id, fact_2_id WITH CUBE
ORDER BY fact_1_id, fact_2_id;
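To make the bit tests in the CASE expressions easier to follow, here is a small Python sketch that enumerates the grouping sets of a two-column CUBE and applies the same mask check. It is only an illustration, not Hive code: the helper name cube_grouping_rows is hypothetical, and it assumes the bitmask layout the query above relies on (bit i of GROUPING__ID is 1 when the i-th GROUP BY column is part of the grouping set). That layout may differ between Hive versions, so it should be checked against the version in use.

```python
from itertools import product


def cube_grouping_rows(columns):
    """Enumerate the grouping sets of CUBE(columns).

    Returns a list of (grouping_id, flags) pairs, where flags maps each
    column to the value GROUPING(column) would have for that grouping set.
    """
    rows = []
    for present in product([True, False], repeat=len(columns)):
        # ASSUMPTION: bit i of GROUPING__ID is 1 when columns[i] is part of
        # the grouping set (the layout the CASE expressions depend on).
        grouping_id = sum(1 << i for i, p in enumerate(present) if p)
        # Same test as: case when (GROUPING__ID & mask) = 0 then 1 else 0 end
        flags = {
            col: 1 if (grouping_id & (1 << i)) == 0 else 0
            for i, col in enumerate(columns)
        }
        rows.append((grouping_id, flags))
    return rows


for grouping_id, flags in cube_grouping_rows(["fact_1_id", "fact_2_id"]):
    print(grouping_id, flags)
```

Under this assumption, the mask for the n-th GROUP BY column (counting from zero) is 2^n, so a third column would be tested with & 4.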
It would be great if GROUPING() were implemented in Hive. I see two ways to do it:
1) Handle it at the parser level, rewriting each GROUPING() call into the equivalent CASE expression.
2) Add a GROUPING() aggregate function to Hive (recommended).
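As a rough idea of what option 1 amounts to, the rewrite can be approximated outside Hive with a text substitution. The Python sketch below is not Hive parser code; the function rewrite_grouping and its group_by_columns parameter are hypothetical names, it works on the query string rather than on a parse tree, and it assumes the same GROUPING__ID bit layout as the transformed query above. It does not convert GROUP BY CUBE (...) to the WITH CUBE form.

```python
import re


def rewrite_grouping(query, group_by_columns):
    """Replace each GROUPING(col) call with a CASE expression on GROUPING__ID.

    group_by_columns must list the GROUP BY columns in order, spelled exactly
    as in the query; an unknown column raises ValueError.
    """
    def replace(match):
        column = match.group(1)
        # ASSUMPTION: the n-th GROUP BY column maps to bit n of GROUPING__ID.
        mask = 1 << group_by_columns.index(column)
        return f"(case when (GROUPING__ID & {mask}) = 0 then 1 else 0 end)"

    # Matches GROUPING(<identifier>) only; GROUPING__ID and GROUPING SETS
    # are left alone because no parenthesis follows the keyword directly.
    pattern = r"\bGROUPING\s*\(\s*(\w+)\s*\)"
    return re.sub(pattern, replace, query, flags=re.IGNORECASE)


select_list = "SELECT GROUPING(fact_1_id) AS f1g, GROUPING(fact_2_id) AS f2g"
print(rewrite_grouping(select_list, ["fact_1_id", "fact_2_id"]))
```

A string-level rewrite like this breaks on comments, quoted identifiers, and nested expressions, which is one reason a built-in GROUPING() function looks like the better option.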