PIG-4549 (sub-task of PIG-4059, Pig on Spark)

Set CROSS operation parallelism for Spark engine

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

      Description

      The Spark engine should set the parallelism that the GFCross UDF uses for the CROSS operation.

      If it is not set, GFCross throws an exception:

                      String s = cfg.get(PigImplConstants.PIG_CROSS_PARALLELISM + "." + crossKey);
                      if (s == null) {
                          throw new IOException("Unable to get parallelism hint from job conf");
                      }
      

      Estimating parallelism for the Spark engine is still to be done. Until then, for CROSS to work, GFCross should fall back to its default parallelism value instead of throwing.
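
      The proposed fallback can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: a plain Map stands in for the Hadoop job configuration, and the class name, the key string "pig.cross.parallelism", and the default value of 96 are assumptions made for the example. The real constants should be checked in PigImplConstants and GFCross.

```java
import java.util.Map;

// Sketch only: a java.util.Map stands in for the Hadoop job configuration.
final class CrossParallelismSketch {
    // Assumed values for illustration; verify against the Pig source.
    static final String PIG_CROSS_PARALLELISM = "pig.cross.parallelism";
    static final int DEFAULT_PARALLELISM = 96;

    // Returns the configured parallelism for the given CROSS key,
    // or the default when the engine has not set a hint.
    static int resolveParallelism(Map<String, String> cfg, String crossKey) {
        String s = cfg.get(PIG_CROSS_PARALLELISM + "." + crossKey);
        if (s == null) {
            // Previously this case threw
            // IOException("Unable to get parallelism hint from job conf").
            return DEFAULT_PARALLELISM;
        }
        return Integer.parseInt(s.trim());
    }
}
```

      With an empty configuration, resolveParallelism(cfg, "1") returns the default. Once the engine sets the key for that CROSS, the configured value wins.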

        Attachments

        1. PIG-4549.patch
          127 kB
          Mohit Sabharwal
        2. PIG-4549.1.patch
          7 kB
          Mohit Sabharwal
        3. PIG-4549.2.patch
          13 kB
          Mohit Sabharwal

              People

              • Assignee: Mohit Sabharwal (mohitsabharwal)
              • Reporter: Mohit Sabharwal (mohitsabharwal)