PIG-4549 (sub-task of PIG-4059 Pig on Spark; project: Pig)

Set CROSS operation parallelism for Spark engine


    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

      Description

      The Spark engine should set the parallelism that the GFCross UDF uses for the CROSS operation.

      If the value is not set, GFCross throws an exception:

          String s = cfg.get(PigImplConstants.PIG_CROSS_PARALLELISM + "." + crossKey);
          if (s == null) {
              throw new IOException("Unable to get parallelism hint from job conf");
          }

      Estimating parallelism for the Spark engine is still to be done. Until then, for CROSS to work, we should use the default parallelism value in GFCross.
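
      The interim approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the attached patches: a plain Map stands in for the job configuration, and the class name, the setDefaultHint helper, the key string "pig.cross.parallelism", and the default value 96 are all assumptions made for the example. Only the lookup and the exception message mirror the snippet quoted in the description.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and values below are assumptions, not Pig's actual code.
class CrossParallelismSketch {
    // Assumed stand-in for PigImplConstants.PIG_CROSS_PARALLELISM.
    static final String PIG_CROSS_PARALLELISM = "pig.cross.parallelism";

    // Assumed stand-in for the default parallelism value in GFCross.
    static final int DEFAULT_PARALLELISM = 96;

    // Engine side: until parallelism is estimated, record the default as the
    // hint for this CROSS key, unless a hint is already present.
    static void setDefaultHint(Map<String, String> cfg, String crossKey) {
        cfg.putIfAbsent(PIG_CROSS_PARALLELISM + "." + crossKey,
                Integer.toString(DEFAULT_PARALLELISM));
    }

    // UDF side: the same lookup as in the description; fails if no hint was set.
    static int readHint(Map<String, String> cfg, String crossKey) throws IOException {
        String s = cfg.get(PIG_CROSS_PARALLELISM + "." + crossKey);
        if (s == null) {
            throw new IOException("Unable to get parallelism hint from job conf");
        }
        return Integer.parseInt(s);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> cfg = new HashMap<>();
        setDefaultHint(cfg, "1");               // the engine sets the hint first
        System.out.println(readHint(cfg, "1")); // the lookup no longer throws
    }
}
```

      With the hint written before the UDF runs, the lookup succeeds without changing the check itself. Once parallelism estimation exists, the engine would presumably write the estimated value instead of the default.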

        Attachments

        1. PIG-4549.patch (127 kB, Mohit Sabharwal)
        2. PIG-4549.1.patch (7 kB, Mohit Sabharwal)
        3. PIG-4549.2.patch (13 kB, Mohit Sabharwal)

            People

            • Assignee: Mohit Sabharwal (mohitsabharwal)
            • Reporter: Mohit Sabharwal (mohitsabharwal)
