Pig / PIG-4059 Pig on Spark / PIG-4549

Set CROSS operation parallelism for Spark engine

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: spark-branch
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

    Description

      The Spark engine should set the parallelism that the GFCross UDF uses for the CROSS operation.

      If it is not set, GFCross throws an exception:

          String s = cfg.get(PigImplConstants.PIG_CROSS_PARALLELISM + "." + crossKey);
          if (s == null) {
              throw new IOException("Unable to get parallelism hint from job conf");
          }

      Estimating parallelism for the Spark engine is still to be done. Until then, for CROSS to work, the Spark engine should use the default parallelism value defined in GFCross.
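
      The interim approach described above could be sketched roughly as follows. This is a minimal, self-contained illustration, not the attached patch. A plain Map stands in for the job configuration, the key string "pig.cross.parallelism" is an assumed value for PigImplConstants.PIG_CROSS_PARALLELISM, the default of 96 is a placeholder for whatever GFCross defines, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class CrossParallelismSketch {
    // Assumed stand-in for PigImplConstants.PIG_CROSS_PARALLELISM.
    static final String PIG_CROSS_PARALLELISM = "pig.cross.parallelism";
    // Placeholder for the default parallelism defined in GFCross (value assumed).
    static final int DEFAULT_PARALLELISM = 96;

    // Hypothetical engine-side step: set the hint for this CROSS if nothing set it yet.
    static void setDefaultIfMissing(Map<String, String> cfg, String crossKey) {
        cfg.putIfAbsent(PIG_CROSS_PARALLELISM + "." + crossKey,
                Integer.toString(DEFAULT_PARALLELISM));
    }

    // Mirrors the lookup quoted in the description: fails when the hint is absent.
    static int readParallelism(Map<String, String> cfg, String crossKey) throws IOException {
        String s = cfg.get(PIG_CROSS_PARALLELISM + "." + crossKey);
        if (s == null) {
            throw new IOException("Unable to get parallelism hint from job conf");
        }
        return Integer.parseInt(s);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> cfg = new HashMap<>();
        String crossKey = "1"; // hypothetical key identifying one CROSS operation

        try {
            readParallelism(cfg, crossKey);
        } catch (IOException e) {
            System.out.println("Before setting: " + e.getMessage());
        }

        setDefaultIfMissing(cfg, crossKey);
        System.out.println("After setting: " + readParallelism(cfg, crossKey));
    }
}
```

      Setting the hint on the engine side leaves the GFCross lookup unchanged, so a proper parallelism estimate can later replace the default without touching the UDF.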

      Attachments

        1. PIG-4549.patch
          127 kB
          Mohit Sabharwal
        2. PIG-4549.1.patch
          7 kB
          Mohit Sabharwal
        3. PIG-4549.2.patch
          13 kB
          Mohit Sabharwal

            People

              Assignee: Mohit Sabharwal
              Reporter: Mohit Sabharwal
              Votes: 0
              Watchers: 2
