Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Description
In the current implementation of FRJoin (PIG-4771), we simply set the replication factor of the data to 10 to make data access more efficient, because the existing FRJoin algorithm can be reused this way. If broadcasting the small RDD turns out to improve performance significantly, we need to figure out how to implement FRJoin with a broadcast RDD in the current code base.
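
To illustrate the idea being proposed, here is a minimal sketch of a fragment-replicate (map-side) join in plain Python. It is not the Pig on Spark code: the function names and the representation of partitions as lists of tuples are assumptions made for illustration. The small relation is indexed once by join key, and every partition of the large relation is joined against that index locally, without a shuffle. In Spark, the index would presumably be shipped to executors with SparkContext.broadcast and the per-partition step run inside mapPartitions.

```python
from collections import defaultdict


def build_lookup(small_rows, key_index=0):
    """Index the small relation by join key (the structure that would be broadcast)."""
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key_index]].append(row)
    return dict(lookup)


def join_partition(large_partition, lookup, key_index=0):
    """Inner-join one partition of the large relation against the in-memory lookup."""
    for row in large_partition:
        # Rows with no match in the small relation produce no output (inner join).
        for match in lookup.get(row[key_index], ()):
            yield row + match


def fr_join(large_partitions, small_rows):
    """Join every partition independently; the large relation is never shuffled."""
    lookup = build_lookup(small_rows)
    return [list(join_partition(p, lookup)) for p in large_partitions]


if __name__ == "__main__":
    large = [[(1, "a"), (2, "b")], [(3, "c"), (1, "d")]]
    small = [(1, "x"), (3, "y"), (1, "z")]
    for part in fr_join(large, small):
        print(part)
```

The sketch only works when the small relation fits in memory on every worker, which is the same precondition FRJoin already has. Whether it outperforms the replication-factor approach is the open question of this sub-task.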