Spark / SPARK-7277

Property mapred.reduce.tasks replaced by spark.sql.shuffle.partitions


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.1
    • Fix Version/s: 1.4.0
    • Component/s: SQL
    • Labels: None

    Description

      When I run "SET mapred.reduce.tasks", I get the warning "SetCommand: Property mapred.reduce.tasks is deprecated, automatically converted to spark.sql.shuffle.partitions instead."

      It's true that mapred.reduce.tasks is deprecated, but this automatic replacement causes serious trouble:

      Setting mapred.reduce.tasks to -1 (negative one) is valid and causes Hadoop/Hive to determine the required number of reducers automatically.

      Setting spark.sql.shuffle.partitions to a negative value, however, can cause Spark to produce incorrect results.
      On my system (spark-sql 1.3.1 running on a single machine in "local" mode), with this setting any outer join produces no output, whereas an inner join works correctly.
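      The mismatch above can be sketched in plain Python (hypothetical function and constant names, not Spark's actual code): in Hadoop/Hive a non-positive mapred.reduce.tasks means "pick the reducer count automatically", so a conversion to spark.sql.shuffle.partitions should fall back to the default partition count rather than pass the negative number through.

```python
# Sketch of the guard the deprecated-property conversion needs.
# DEFAULT_SHUFFLE_PARTITIONS mirrors Spark's documented default of 200
# for spark.sql.shuffle.partitions; the function name is hypothetical.
DEFAULT_SHUFFLE_PARTITIONS = 200

def convert_reduce_tasks(value: str) -> int:
    """Map a `mapred.reduce.tasks` setting to a usable
    `spark.sql.shuffle.partitions` value.

    A non-positive value (conventionally -1) means "determine the
    reducer count automatically" in Hadoop/Hive; Spark has no such
    mode, so fall back to the default instead of forwarding the
    negative number.
    """
    n = int(value)
    if n <= 0:
        return DEFAULT_SHUFFLE_PARTITIONS
    return n

print(convert_reduce_tasks("-1"))  # -1 falls back to the default, 200
print(convert_reduce_tasks("8"))   # positive values pass through unchanged
```

      With a guard like this, "SET mapred.reduce.tasks=-1" would keep query results correct instead of silently configuring a negative partition count.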

      Attachments

        Activity

          People

            Assignee: viirya L. C. Hsieh
            Reporter: tannin Sebastian
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved: