Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-2590

Add config property to disable incremental collection used in Thrift server

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • None
    • 1.1.0
    • SQL
    • None

    Description

      SparkSQLOperationManager uses RDD.toLocalIterator to collect the result set one partition at a time. This is useful to avoid OOM when the result is large, but introduces extra job scheduling costs as each partition is collected with a separate job. Users may want to disable this when the result set is expected to be small.

      UPDATE Incremental collection hurts performance because tasks of the last stage of the RDD DAG generated from the SQL query plan are executed sequentially. Thus we decided to disable it by default.

      Attachments

        Issue Links

          Activity

            People

              lian cheng Cheng Lian
              lian cheng Cheng Lian
              Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: