Details
-
Sub-task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
spark-branch
-
None
Description
Spark engine should set parallelism to be used for CROSS operation by GFCross UDF.
If not set, GFCross throws an exception:
String s = cfg.get(PigImplConstants.PIG_CROSS_PARALLELISM + "." + crossKey); if (s == null) { throw new IOException("Unable to get parallelism hint from job conf"); }
Estimating parallelism for Spark engine is a TBD item. Until that is done, for CROSS to work, we should use the default parallelism value in GFCross.
Attachments
Attachments
Issue Links
- blocks
-
PIG-4189 Make cross join work with Spark
- Resolved
- links to