Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
Moving the discussion to a new JIRA:
I implemented group_cat() in a rush and ran into two problems that are difficult to solve:
1. group_cat() has an internal ORDER BY clause, and we currently can't implement such an aggregation in Hive.
2. When the strings to be group-concatenated are too large, in other words when data skew appears, there is often not enough memory to hold such a large result.
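
To make the two problems concrete, here is a minimal sketch in plain Python, not Hive code. The function signature, the `(key, sort_key, value)` row layout, and the separator are assumptions for illustration only; they are not taken from the actual group_cat() implementation. It shows that ordering inside the aggregate forces every value of a group to be buffered until the group is complete, so a skewed key makes the buffer (and the final string) grow with the size of that group.

```python
from collections import defaultdict


def group_cat(rows, sep=","):
    """Hypothetical illustration: concatenate values per key,
    ordered by sort_key within each group.

    rows is an iterable of (key, sort_key, value) tuples (assumed layout).
    """
    buffers = defaultdict(list)
    for key, sort_key, value in rows:
        # Each value has to be kept in memory: the final position of a value
        # is unknown until all rows of its group have been seen.
        buffers[key].append((sort_key, value))

    result = {}
    for key, items in buffers.items():
        # The "internal ORDER BY": sort the buffered values of one group.
        items.sort(key=lambda pair: pair[0])
        result[key] = sep.join(value for _, value in items)
    return result


if __name__ == "__main__":
    rows = [("a", 2, "y"), ("b", 1, "p"), ("a", 1, "x"), ("a", 3, "z")]
    print(group_cat(rows))
```

In this sketch the memory held for one key is proportional to the number of rows carrying that key, which is the data skew concern from point 2. Avoiding that would need something beyond a simple in-memory buffer, for example relying on pre-sorted input or spilling to disk, neither of which is shown here.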