Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-10087

Disable reducer locality in 1.5

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • 1.5.0
    • 1.5.0
    • Scheduler, Spark Core
    • None

    Description

      In some cases, when spark.shuffle.reduceLocality.enabled is enabled, we are scheduling all reducers to the same executor (the cluster has plenty of resources). Changing spark.shuffle.reduceLocality.enabled to false resolve the problem.

      Comments of https://github.com/apache/spark/pull/8280 provide more details of the symptom of this issue.

      The query I was using is

      select
        i_brand_id,
        i_brand,
        i_manufact_id,
        i_manufact,
        sum(ss_ext_sales_price) ext_price
      from
        store_sales
        join item on (store_sales.ss_item_sk = item.i_item_sk)
        join customer on (store_sales.ss_customer_sk = customer.c_customer_sk)
        join customer_address on (customer.c_current_addr_sk = customer_address.ca_address_sk)
        join store on (store_sales.ss_store_sk = store.s_store_sk)
        join date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
      where
        --ss_date between '1999-11-01' and '1999-11-30'
        ss_sold_date_sk between 2451484 and 2451513
        and d_moy = 11
        and d_year = 1999
        and i_manager_id = 7
        and substr(ca_zip, 1, 5) <> substr(s_zip, 1, 5)
      group by
        i_brand,
        i_brand_id,
        i_manufact_id,
        i_manufact
      order by
        ext_price desc,
        i_brand,
        i_brand_id,
        i_manufact_id,
        i_manufact
      limit 100
      

      The dataset is tpc-ds scale factor 1500. To reproduce the problem, you can just join store_sales with customer and make sure there is only one mapper reads the data of customer.

      Attachments

        Activity

          People

            yhuai Yin Huai
            yhuai Yin Huai
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: