  1. Pig
  2. PIG-4856 Optimization for pig on spark
  3. PIG-5068

Set SPARK_REDUCERS by pig.properties not by system configuration

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

    Description

      In SparkUtil.java, getParallelism reads SPARK_REDUCERS from a system environment variable:

          public static int getParallelism(List<RDD<Tuple>> predecessors,
                  PhysicalOperator physicalOperator) {
      
              String numReducers = System.getenv("SPARK_REDUCERS");
              if (numReducers != null) {
                  return Integer.parseInt(numReducers);
              }
      
              int parallelism = physicalOperator.getRequestedParallelism();
              if (parallelism <= 0) {
                  // Parallelism wasn't set in Pig, so set it to whatever Spark thinks
                  // is reasonable.
                  parallelism = predecessors.get(0).context().defaultParallelism();
              }
      
              return parallelism;
          }
      

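      It would be better to read this value from pig.properties, so that it is configured together with the other Pig settings instead of through the process environment.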
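
A minimal sketch of the proposed change, not the contents of the attached patches: the reducer count is looked up in a Properties object (standing in for the values loaded from pig.properties) instead of System.getenv. The property key "spark.reducers", the class name ParallelismSketch, and the simplified signature (plain ints in place of the RDD list and PhysicalOperator) are assumptions made so the example runs without Pig or Spark on the classpath.

```java
import java.util.Properties;

public class ParallelismSketch {

    // Assumed property key; the real key used by the patch may differ.
    static final String REDUCERS_KEY = "spark.reducers";

    /**
     * Resolves the parallelism in the same order as the original method:
     * 1. an explicit reducer count from the properties (replaces System.getenv),
     * 2. the parallelism requested on the operator, if positive,
     * 3. the default parallelism reported by Spark.
     */
    static int getParallelism(Properties pigProperties,
                              int requestedParallelism,
                              int defaultParallelism) {
        String numReducers = pigProperties.getProperty(REDUCERS_KEY);
        if (numReducers != null) {
            return Integer.parseInt(numReducers.trim());
        }

        if (requestedParallelism > 0) {
            return requestedParallelism;
        }

        // Parallelism wasn't set in Pig, so fall back to Spark's default.
        return defaultParallelism;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(REDUCERS_KEY, "8");

        // Property set: it overrides both the requested and the default value.
        System.out.println(getParallelism(props, 4, 2));

        // Property absent: the requested parallelism is used.
        System.out.println(getParallelism(new Properties(), 4, 2));
    }
}
```

Under these assumptions, a line such as spark.reducers=8 in pig.properties would take the place of exporting SPARK_REDUCERS in the shell before launching Pig.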

      Attachments

        1. PIG-5068_1.patch
          3 kB
          liyunzhang
        2. PIG-5068_2.patch
          3 kB
          liyunzhang
        3. PIG-5068.patch
          4 kB
          liyunzhang

          People

            Assignee: kellyzly liyunzhang
            Reporter: kellyzly liyunzhang
            Votes: 0
            Watchers: 3
