Spark / SPARK-6757

spark.sql.shuffle.partitions is global, not per connection

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      We are trying to use the spark.sql.shuffle.partitions parameter to handle large queries differently from smaller queries. We expected that this parameter would be respected per connection, but it seems to be global.

      For example, try this in two separate JDBC connections:

      Connection 1:

      SET spark.sql.shuffle.partitions=10;
      SELECT * FROM some_table;
      

      The correct value, 10, was used.

      Connection 2:

      SET spark.sql.shuffle.partitions=100;
      SELECT * FROM some_table;
      

      The correct value, 100, was used.

      Back to connection 1:

      SELECT * FROM some_table;
      

      We expected 10 to be used, but 100 was used instead.
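
      The difference between the observed and the expected behavior can be sketched with a toy model. This is only an illustration, not Spark's implementation: the classes SharedConfServer, PerSessionConfServer and Connection and the replay helper are hypothetical names, and the starting value of 200 is assumed to be the server default.

```python
# Toy model only: these classes are hypothetical and are not Spark's code.

SHUFFLE_KEY = "spark.sql.shuffle.partitions"


class Connection:
    """A client connection holding a reference to some settings dict."""

    def __init__(self, conf):
        self._conf = conf

    def set(self, key, value):
        # Stands in for running `SET key=value;` on this connection.
        self._conf[key] = value

    def shuffle_partitions(self):
        # The value a query on this connection would pick up.
        return int(self._conf[SHUFFLE_KEY])


class SharedConfServer:
    """Observed behavior: every connection shares one server-wide dict."""

    def __init__(self):
        self._conf = {SHUFFLE_KEY: "200"}  # assumed default

    def connect(self):
        return Connection(self._conf)


class PerSessionConfServer:
    """Expected behavior: each connection gets its own copy of the defaults."""

    def __init__(self):
        self._defaults = {SHUFFLE_KEY: "200"}  # assumed default

    def connect(self):
        return Connection(dict(self._defaults))


def replay(server):
    """Replay the steps from the report; return what connection 1 sees last."""
    conn1 = server.connect()
    conn2 = server.connect()
    conn1.set(SHUFFLE_KEY, "10")
    conn2.set(SHUFFLE_KEY, "100")
    return conn1.shuffle_partitions()
```

      In this model, replay(SharedConfServer()) returns 100, which matches what we observed, while replay(PerSessionConfServer()) returns 10, which is what we expected.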

            People

              Assignee: Unassigned
              Reporter: David Ross (dyross)
              Votes: 1
              Watchers: 1