[PIG-5192] Remove schema tuple reference overhead for replicate join hashmap in POFRJoinSpark - ASF JIRA

Details

Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: spark-branch
Component/s: spark
Labels: None

Description

Currently, even when pig.schematuple is set to false (the default), POFRJoinSpark builds the replicated-join hash map with TupleToMapKey and TuplesToSchemaTupleList instead of a plain HashMap<Object, ArrayList<Tuple>>, which costs a lot of memory. The key is also converted to a tuple, which is unnecessary. For details, see ~~PIG-4874~~.
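
For illustration, the plain-map layout proposed above might look roughly like the sketch below. It is not the POFRJoinSpark code: `List<Object>` stands in for Pig's `Tuple` so the example compiles without Pig on the classpath, and the class and method names (`ReplicatedHashSketch`, `build`, `probe`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: List<Object> is a stand-in for org.apache.pig.data.Tuple.
class ReplicatedHashSketch {

    // Build the hash table for the replicated (small) input. The map is keyed
    // by the raw join key object, not by a key wrapped in a one-field tuple.
    static Map<Object, ArrayList<List<Object>>> build(List<List<Object>> rows, int keyIndex) {
        Map<Object, ArrayList<List<Object>>> table = new HashMap<>();
        for (List<Object> row : rows) {
            Object key = row.get(keyIndex);
            table.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        return table;
    }

    // Look up the replicated rows that match one key from the streamed input.
    static List<List<Object>> probe(Map<Object, ArrayList<List<Object>>> table, Object key) {
        List<List<Object>> matches = table.get(key);
        return matches == null ? Collections.emptyList() : matches;
    }

    public static void main(String[] args) {
        List<List<Object>> small = new ArrayList<>();
        small.add(Arrays.<Object>asList(1, "a"));
        small.add(Arrays.<Object>asList(1, "b"));
        small.add(Arrays.<Object>asList(2, "c"));

        Map<Object, ArrayList<List<Object>>> table = build(small, 0);
        System.out.println(probe(table, 1).size()); // two rows share key 1
        System.out.println(probe(table, 3).size()); // no rows have key 3
    }
}
```

In this layout each distinct key costs one map entry plus one ArrayList, with no wrapper object for the key and no schema-tuple list wrapper for the values.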

People

Assignee: Unassigned

Reporter: liyunzhang

Votes: 0

Watchers: 1

Dates

Created: 16/Mar/17 21:50

Updated: 16/Mar/17 21:51