[SPARK-2590] Add config property to disable incremental collection used in Thrift server - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.1.0
Component/s: SQL
Labels: None
Target Version/s: 1.1.0

Description

SparkSQLOperationManager uses RDD.toLocalIterator to collect the result set one partition at a time. This is useful to avoid OOM when the result is large, but introduces extra job scheduling costs as each partition is collected with a separate job. Users may want to disable this when the result set is expected to be small.

UPDATE: Incremental collection hurts performance because the tasks of the last stage of the RDD DAG generated from the SQL query plan are executed sequentially, one partition per job. We therefore decided to disable it by default.
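
The trade-off described above can be illustrated with a small toy model. This is a sketch only: `ToyRDD`, `fetch_results`, and the `incremental_collect` flag are hypothetical names invented for illustration, not Spark's API or the actual patch. The model counts one "job" per partition for incremental collection and a single job for a full collect.

```python
from typing import Iterable, Iterator, List


class ToyRDD:
    """Hypothetical stand-in for an RDD: a list of partitions plus a job counter."""

    def __init__(self, partitions: Iterable[Iterable[int]]) -> None:
        self.partitions: List[List[int]] = [list(p) for p in partitions]
        self.jobs_run = 0

    def collect(self) -> List[int]:
        # A single job computes every partition; all rows are held in memory at once.
        self.jobs_run += 1
        return [row for part in self.partitions for row in part]

    def to_local_iterator(self) -> Iterator[int]:
        # One job per partition, run sequentially as the consumer advances;
        # only the current partition's rows are held at a time.
        for part in self.partitions:
            self.jobs_run += 1
            yield from part


def fetch_results(rdd: ToyRDD, incremental_collect: bool = False) -> Iterator[int]:
    """Return the result rows; the flag models the proposed switch (off by default)."""
    if incremental_collect:
        return rdd.to_local_iterator()
    return iter(rdd.collect())
```

With three partitions, the incremental path yields the same rows as the full collect but schedules three jobs instead of one, which is the extra scheduling cost the description refers to. In exchange, it never materializes more than one partition at a time.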

Issue Links

is duplicated by: SPARK-2591 Add config property to disable incremental collection used in Thrift server (Resolved)

links to: [Github] Pull Request #1853 (liancheng)

People

Assignee: Cheng Lian
Reporter: Cheng Lian
Votes: 1
Watchers: 3

Dates

Created: 19/Jul/14 06:01
Updated: 12/Aug/14 03:08
Resolved: 12/Aug/14 03:08