- Type: Sub-task
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: spark-branch
- Component/s: spark
- Labels: None
Currently, even when pig.schematuple is set to false (the default), using TupleToMapKey and TuplesToSchemaTupleList instead of a plain HashMap<Object, ArrayList<Tuple>> costs a lot of memory. The key is also converted to a tuple, which is unnecessary. See PIG-4874 for details.
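
For illustration only, here is a minimal sketch of the plain-map alternative: rows are grouped in a HashMap keyed directly by the raw key object, without wrapping the key in a tuple. It is not Pig's actual code. The class name `PlainHashMapSketch`, the method `buildTable`, and the use of `List<Object>` as a stand-in for Pig's `Tuple` are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: List<Object> stands in for Pig's Tuple type.
public class PlainHashMapSketch {

    // Groups rows by the field at keyIndex, using the raw field value as the
    // map key (no extra tuple is allocated to wrap the key).
    static Map<Object, ArrayList<List<Object>>> buildTable(List<List<Object>> rows, int keyIndex) {
        Map<Object, ArrayList<List<Object>>> table = new HashMap<>();
        for (List<Object> row : rows) {
            Object key = row.get(keyIndex);
            table.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        return table;
    }

    public static void main(String[] args) {
        List<List<Object>> rows = Arrays.asList(
                Arrays.asList((Object) 1, "a"),
                Arrays.asList((Object) 1, "b"),
                Arrays.asList((Object) 2, "c"));
        Map<Object, ArrayList<List<Object>>> table = buildTable(rows, 0);
        // Look up all rows that share key 1.
        System.out.println(table.get(1));
    }
}
```

Whether the raw key can be used directly depends on its type having suitable `equals` and `hashCode` behavior, which this sketch simply assumes.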