Spark / SPARK-2468

Netty-based block server / client module

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: Shuffle, Spark Core
    • Labels: None
    • Target Version/s:

      Description

      Right now shuffle send goes through the block manager. This is inefficient because it requires loading a block from disk into a kernel buffer, then into a user-space buffer, and then back into a kernel send buffer before it reaches the NIC. It makes multiple copies of the data and context-switches between kernel and user space. It also creates unnecessary buffers in the JVM, which increases GC pressure.

      Instead, we should use FileChannel.transferTo, which handles this in the kernel space with zero-copy. See http://www.ibm.com/developerworks/library/j-zerocopy/
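      For reference, a minimal sketch of the zero-copy path, assuming a plain java.nio FileChannel and an already-connected SocketChannel (the names are illustrative, not Spark's actual classes):

      import java.io.{File, RandomAccessFile}
      import java.nio.channels.SocketChannel

      // Illustrative only: stream a block file straight to a socket with
      // FileChannel.transferTo, letting the kernel copy the bytes (zero-copy)
      // instead of pulling them through JVM user-space buffers.
      def sendBlock(file: File, socket: SocketChannel): Unit = {
        val channel = new RandomAccessFile(file, "r").getChannel
        try {
          var position = 0L
          val size = channel.size()
          while (position < size) {
            // transferTo may send fewer bytes than requested, so loop until done.
            position += channel.transferTo(position, size - position, socket)
          }
        } finally {
          channel.close()
        }
      }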

      One potential solution is to use Netty. Spark already has a Netty based network module implemented (org.apache.spark.network.netty). However, it lacks some functionality and is turned off by default.

        Issue Links

        1. Migrate Netty network module from Java to Scala (Sub-task, Resolved, Reynold Xin)
        2. Support transferring large blocks in Netty network module (Sub-task, Resolved, Reynold Xin)
        3. Add config option to support NIO vs OIO in Netty network module (Sub-task, Resolved, Reynold Xin)
        4. Support fetching in-memory blocks for Netty network module (Sub-task, Resolved, Reynold Xin)
        5. Create config options for Netty sendBufferSize and receiveBufferSize (Sub-task, Resolved, Reynold Xin)
        6. Report error messages back from server to client (Sub-task, Resolved, Reynold Xin)
        7. Support fetching multiple blocks in a single request in Netty network module (Sub-task, Resolved, Reynold Xin)
        8. Use a single FileClient and Netty client thread pool (Sub-task, Resolved, Reynold Xin)
        9. Support SASL authentication in Netty network module (Sub-task, Resolved, Aaron Davidson)
        10. Leverage Hadoop native io's fadvise and read-ahead in Netty transferTo (Sub-task, Closed, Unassigned)
        11. Recycle ByteBufs by using PooledByteBufAllocator (Sub-task, Resolved, Reynold Xin)
        12. Maintains a connection pool and reuse clients in BlockClientFactory (Sub-task, Resolved, Reynold Xin)
        13. Client should be able to put blocks in addition to fetch blocks (Sub-task, Resolved, Reynold Xin)
        14. Implement unit/integration tests for connection failures (Sub-task, Resolved, Reynold Xin)
        15. Release all ManagedBuffers upon task completion/failure (Sub-task, Resolved, Reynold Xin)
        16. Make sure client doesn't block when server/connection has error(s) (Sub-task, Resolved, Reynold Xin)
        17. Refactor Netty module to use BlockTransferService (Sub-task, Resolved, Reynold Xin)
        18. SO_RCVBUF and SO_SNDBUF should be bootstrap childOption, not option (Sub-task, Resolved, Reynold Xin)
        19. Disable thread local cache in PooledByteBufAllocator (Sub-task, Resolved, Reynold Xin)

          Activity

          zzcclp Zhichao Zhang added a comment -

          Ah, Aaron Davidson, with patch #3465 I can now successfully run the previously failed application, and my configuration is the same as before. It's great.

          zzcclp Zhichao Zhang added a comment -

          Great. I will test it later.

          rxin Reynold Xin added a comment -

          Glad that we are able to resolve this!

          lianhuiwang Lianhui Wang added a comment -

          Aaron Davidson, that's great. With patch #3465, I can now successfully run the previously failed application, and my configuration is the same as before.
          Thanks.

          ilikerps Aaron Davidson added a comment -

          Hey guys, I finally got a chance to run a more comprehensive set of tests with constrained containers. In doing so, I found a critical issue which caused us to allocate direct byte buffers proportional to the number of executors times the number of cores, rather than just proportional to the number of cores. With patch #3465, I was able to run a shuffle with Lianhui Wang's configuration of 7GB container with 6GB heap and 2 cores – prior to the patch, it exceeded the container's limits.

          If you guys get a chance, please let me know if this is sufficient to fix your issues with your initial overhead configurations. (Note that while the memory usage was greatly decreased, we still allocate a significant amount of off-heap memory, so it's possible you need to shift some of the heap to off-heap if your off-heap was previously very constrained.)

          zzcclp Zhichao Zhang added a comment -

          Hi, Aaron Davidson, I sent an email to you about the shuffle data performance test. Looking forward to your reply.
          Thanks.

          ilikerps Aaron Davidson added a comment - - edited

          Here is my Spark configuration for the test, 32 cores total (note that this is a test-only configuration to maximize throughput; I would not recommend these settings for real workloads):

          spark.shuffle.io.clientThreads =16,
          spark.shuffle.io.serverThreads =16,
          spark.serializer = "org.apache.spark.serializer.KryoSerializer",
          spark.shuffle.blockTransferService = "netty",
          spark.shuffle.compress = false,
          spark.shuffle.io.maxRetries = 0,
          spark.reducer.maxMbInFlight = 512

          Forgot to mention, but #3155 now automatically sets spark.shuffle.io.clientThreads and spark.shuffle.io.serverThreads based on the number of cores the Executor has allotted to it. You can override them by setting those properties by hand, but ideally the default behavior is sufficient.
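
          As a side note, a minimal sketch of setting these test properties programmatically via SparkConf (values copied from the list above; this mirrors a test-only setup, not a recommended production configuration):

          import org.apache.spark.SparkConf

          // Mirrors the test-only settings listed above; not recommended for real workloads.
          val conf = new SparkConf()
            .set("spark.shuffle.io.clientThreads", "16")
            .set("spark.shuffle.io.serverThreads", "16")
            .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
            .set("spark.shuffle.blockTransferService", "netty")
            .set("spark.shuffle.compress", "false")
            .set("spark.shuffle.io.maxRetries", "0")
            .set("spark.reducer.maxMbInFlight", "512")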

          zzcclp Zhichao Zhang added a comment -

          Aaron Davidson, I see that #3155 has been merged into master, but I cannot find spark.shuffle.io.maxUsableCores?

          zzcclp Zhichao Zhang added a comment -

          Aaron Davidson, thank you for your reply. I will try it again.
          Can you describe your spark.* configuration?

          ilikerps Aaron Davidson added a comment -

          My test was significantly less strict in its memory requirements, which may be the difference with respect to OOMs. I used two 28GB containers on different machines, with 24GB of that given to Spark's heap. Due to the networking of the containers, the maximum throughput was around 7Gb/s (combined directionally), which I was able to saturate using Netty but could only achieve around 3.5Gb/s (combined) using Nio.

          My test was a sort of 50GB of generated data shuffled between the two machines. I tested the sort as a whole as well as a different version where I injected a deserializer which immediately EOFs (this causes us to still read all data but do no computation on the reducer side, maximizing network throughput).

          Here is my full test, including the no-op deserializer:

          import org.apache.spark.SparkConf
          import org.apache.spark.serializer.{Serializer, SerializerInstance, SerializationStream, DeserializationStream}
          import java.io._
          import java.nio.ByteBuffer
          import scala.reflect.ClassTag
          
          class NoOpReadSerializer(conf: SparkConf) extends Serializer with Serializable {
            override def newInstance(): SerializerInstance = {
              new NoOpReadSerializerInstance()
            }
          }
          
          class NoOpReadSerializerInstance()
            extends SerializerInstance {
          
            override def serialize[T: ClassTag](t: T): ByteBuffer = {
              val bos = new ByteArrayOutputStream()
              val out = serializeStream(bos)
              out.writeObject(t)
              out.close()
              ByteBuffer.wrap(bos.toByteArray)
            }
          
            override def deserialize[T: ClassTag](bytes: ByteBuffer): T = {
              null.asInstanceOf[T]
            }
          
            override def deserialize[T: ClassTag](bytes: ByteBuffer, loader: ClassLoader): T = {
              null.asInstanceOf[T]
            }
          
            override def serializeStream(s: OutputStream): SerializationStream = {
              new NoOpSerializationStream(s, 100)
            }
          
            override def deserializeStream(s: InputStream): DeserializationStream = {
              new NoOpDeserializationStream(s, Thread.currentThread().getContextClassLoader)
            }
          
            def deserializeStream(s: InputStream, loader: ClassLoader): DeserializationStream = {
              new NoOpDeserializationStream(s, loader)
            }
          }
          
          class NoOpDeserializationStream(in: InputStream, loader: ClassLoader)
            extends DeserializationStream {
            def readObject[T: ClassTag](): T = throw new EOFException()
            def close() { }
          }
          
          class NoOpSerializationStream(out: OutputStream, counterReset: Int) extends SerializationStream {
            private val objOut = new ObjectOutputStream(out)
            private var counter = 0
          
            def writeObject[T: ClassTag](t: T): SerializationStream = {
              objOut.writeObject(t)
              counter += 1
              if (counterReset > 0 && counter >= counterReset) {
                objOut.reset()
                counter = 0
              }
              this
            }
          
            def flush() { objOut.flush() }
            def close() { objOut.close() }
          }
          
          
          // Test code below:
          implicit val arrayOrdering = Ordering.by((_: Array[Byte]).toIterable)
          def createSort() = sc.parallelize( 0 until 5000000, 320).map { x : Int =>
            val rand = new scala.util.Random(System.nanoTime())
            val bytes = new Array[Byte](10000)
            rand.nextBytes(bytes)
            (bytes, 1)
          }.sortByKey(true, 333)
          
          val x = createSort()
          x.count() // does shuffle + sorting on reduce side
          
          val y = createSort().asInstanceOf[org.apache.spark.rdd.ShuffledRDD[_, _, _]].setSerializer(new NoOpReadSerializer(sc.getConf))
          y.count() // does shuffle with no read-side computation (warning: causes FD leak in Spark!)
          

          Note that if you run that with less memory, you may have to tune the number of partitions or the size of the data to avoid invoking the ExternalSorter. I observed very little GC and no significant heap/process growth in memory after the first run.

          I will try another test where the memory is more constrained to further investigate the OOM problem.

          lianhuiwang Lianhui Wang added a comment -

          Aaron Davidson, yes, I re-ran the test with preferDirectBufs=true and maxUsableCores=2 on YARN. That caused the container to be killed because it was running beyond physical memory limits.
          If I set preferDirectBufs=false, it is OK, but with preferDirectBufs=false Netty's performance is not better than NioBlockTransferService.
          In my test the shuffle data size is 1-2G, executor-memory=7g, executor-cores=2, spark.yarn.executor.memoryOverhead=1024.

          zzcclp Zhichao Zhang added a comment -

          Hi, Aaron Davidson, can you describe your test, including the environment, configuration, and data volume?

          zzcclp Zhichao Zhang added a comment -

          Hi, Aaron Davidson, I am sure that I ran my last test with the patch #3155 applied.
          Configuration:
          spark.shuffle.consolidateFiles true
          spark.storage.memoryFraction 0.2
          spark.shuffle.memoryFraction 0.2
          spark.shuffle.file.buffer.kb 100
          spark.reducer.maxMbInFlight 48
          spark.shuffle.blockTransferService netty
          spark.shuffle.io.mode nio
          spark.shuffle.io.connectionTimeout 120
          spark.shuffle.manager SORT

          spark.shuffle.io.preferDirectBufs true
          spark.shuffle.io.maxRetries 3
          spark.shuffle.io.retryWaitMs 5000
          spark.shuffle.io.maxUsableCores 3

          command:
          --num-executors 17 --executor-memory 12g --executor-cores 3

          If spark.shuffle.io.preferDirectBufs=false, it's OK.

          ilikerps Aaron Davidson added a comment -

          Zhichao Zhang Ah, man, that's not good. Just to be certain, you ran your last test with the patch #3155 applied? We shouldn't be able to allocate more than 96MB of off-heap memory if so, which should be well within the 1GB you had left over between the 12GB Spark heap and 13GB YARN container.

          Lianhui Wang Were you able to re-run the test at any point? I ran a simple benchmark on ec2 and did not see any regressions from earlier, so if you're still seeing perf being worse than NIO, that suggests it may be workload-specific, making it harder to reproduce.

          zzcclp Zhichao Zhang added a comment -

          @Aaron Davidson, I tried with preferDirectBufs=true and maxUsableCores=3, the same as executor-cores, and the OOM still occurs.

          zzcclp Zhichao Zhang added a comment -

          @Aaron Davidson, thank you for the suggestion.
          The configuration and code for comparing Hadoop vs Spark performance are not shown above. It just runs a wordcount on 240G of snappy files and writes 500G of shuffle files. The configuration is as follows:

          command "--driver-memory 10g --num-executors 17 --executor-memory 12g --executor-cores 3 --driver-library-path :/usr/local/hadoop/lib/native/ /opt/wsspark.jar 24G_10_20g_1c 1 100 hdfs://wscluster/zzc_test/in/snappy8/ 100 100 hdfs://wscluster/zzc_test/out/i007"

          Configuration:
          spark.default.parallelism 204
          spark.shuffle.consolidateFiles false
          spark.shuffle.spill.compress true
          spark.shuffle.compress true
          spark.storage.memoryFraction 0.3
          spark.shuffle.memoryFraction 0.5
          spark.shuffle.file.buffer.kb 100
          spark.reducer.maxMbInFlight 48
          spark.shuffle.blockTransferService nio
          spark.shuffle.manager HASH
          spark.scheduler.mode FIFO
          spark.akka.frameSize 10
          spark.akka.timeout 100

          lianhuiwang Lianhui Wang added a comment -

          OK, thanks, Aaron Davidson. I will try to do as you say.

          ilikerps Aaron Davidson added a comment -

          Lianhui Wang Can you try again with preferDirectBufs set to true, and just setting maxUsableCores down to the number of cores each container actually has? It's possible the performance discrepancy you're seeing is simply due to heap byte buffers not being as fast as direct ones. You might also decrease the Java heap size a bit while keeping the container size the same, if any direct memory allocation is causing the container to be killed.

          Zhichao Zhang Same suggestion for you about setting preferDirectBufs to true and setting maxUsableCores down, but I will also perform another round of benchmarking – it's possible we accidentally introduced a performance regression in the last few patches.

          Comparing Hadoop vs Spark performance is a different matter. A few suggestions on your setup: you should set executor-cores to 5, so that each executor is actually using 5 cores instead of just 1. You're losing significant parallelism because of this setting, as Spark will only launch 1 task per core on an executor at any given time. Second, groupBy() is inefficient (its doc was changed recently to reflect this) and should be avoided. I would recommend changing your job to sort the whole RDD using something similar to

          mapR.map { x => ((x._1._1, x._2._1), x) }.sortByKey()

          , which would not require that all values for a single group fit in memory. This would still effectively group by x._1._1, but would sort within each group by x._2._1, and would utilize Spark's efficient sorting machinery.
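
          A minimal sketch of that suggested rewrite, under the assumption that the records are shaped like ((id, url), (flow, count)) as in the test code quoted elsewhere in this thread (the sample data and field names are illustrative, and a SparkContext `sc` is assumed, as in the spark-shell):

          // Illustrative records: ((id, url), (flow, count)).
          val mapR = sc.parallelize(Seq(
            (("a", "/x"), (10L, 1)),
            (("a", "/y"), (30L, 2)),
            (("b", "/z"), (20L, 1))
          ))

          // groupBy approach: every value of a group must fit in memory on one reducer.
          val grouped = mapR.groupBy(_._1._1)
            .mapValues(_.toList.sortBy(_._2._1).reverse)

          // Suggested approach: sort globally by (group key, secondary key); the
          // shuffle machinery does the sorting, so no group is materialized in memory.
          val sorted = mapR.map { x => ((x._1._1, x._2._1), x) }.sortByKey()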

          zzcclp Zhichao Zhang added a comment -

          The performance of Netty is worse than NIO in my test. Why, @Aaron Davidson?

          I want to improve shuffle performance; with 500G of shuffle data, the performance is much worse than Hadoop.

          lianhuiwang Lianhui Wang added a comment - - edited

          Aaron Davidson, yes, with https://github.com/apache/spark/pull/3155/ the beyond-physical-memory-limits problem does not happen in my test. But I found that Netty's performance is not better than NioBlockTransferService, so I need to find out why Netty performs worse than NioBlockTransferService in my test. Can you give me some suggestions? Thanks. And how about your test, Zhichao Zhang?

          zzcclp Zhichao Zhang added a comment -

          Hi, Aaron Davidson, what do you mean by "Is it really the case that each of your executors is only using 1 core for its 20GB of RAM? It seems like 5 would be in line with the portion of memory you're using"?

          I tried setting spark.storage.memoryFraction and spark.shuffle.memoryFraction to values from 0.2 to 0.5 before, and the OOM still occurs.

          ilikerps Aaron Davidson added a comment -

          Zhichao Zhang Thank you for the writeup. Is it really the case that each of your executors is only using 1 core for its 20GB of RAM? It seems like 5 would be in line with the portion of memory you're using. Also, the sum of your storage and shuffle memory fractions exceeds 1, so if you're caching any data and then performing a reduction/groupBy, you could actually see an OOM even without this other issue. I would recommend keeping the shuffle fraction relatively low unless you have a good reason not to, as a high value can lead to increased instability.

          The numbers are relatively close to my expectations, which would estimate netty allocating around 750MB of direct buffer space, thinking that it has 24 cores. With #3155 and maxUsableCores set to 1 (or 5), I hope this issue may be resolved.

          zzcclp Zhichao Zhang added a comment -

          By the way, my test code:

          val mapR = textFile.map(line => {
            ......
            ((value(1) + "_" + date.toString(), url), (flow, 1))
          }).reduceByKey((pair1, pair2) => {
            (pair1._1 + pair2._1, pair1._2 + pair2._2)
          }, 100)

          mapR.persist(StorageLevel.MEMORY_AND_DISK_SER)

          val mapR1 = mapR.groupBy(_._1._1)
            .mapValues(pairs => { pairs.toList.sortBy(_._2._1).reverse })
            .flatMap(values => { values._2 })
            .map(values => { values._1._1 + "\t" + values._1._2 + "\t" + values._2._1.toString() + "\t" + values._2._2.toString() })
            .saveAsTextFile(outputPath + "_1/")

          val mapR2 = mapR.groupBy(_._1._1)
            .mapValues(pairs => { pairs.toList.sortBy(_._2._2).reverse })
            .flatMap(values => { values._2 })
            .map(values => { values._1._1 + "\t" + values._1._2 + "\t" + values._2._1.toString() + "\t" + values._2._2.toString() })
            .saveAsTextFile(outputPath + "_2/")

          zzcclp Zhichao Zhang added a comment -

          Hi, Aaron Davidson, I can't download the logs from the server, so I'll just write the details here:

          There are 3 nodes in the cluster, with 24 cores / 128G per node; YARN can allocate 20 cores and 80G per node.

          I start application with command "--driver-memory 10g --num-executors 10 --executor-memory 20g --executor-cores 1 --driver-library-path :/usr/local/hadoop/lib/native/ /opt/wsspark.jar 24G_10_20g_1c 1 100 hdfs://wscluster/zzc_test/in/snappy8/ 100 100 hdfs://wscluster/zzc_test/out/i007"

          My Spark config:
          spark.default.parallelism 100
          spark.shuffle.consolidateFiles false
          spark.shuffle.spill.compress true
          spark.shuffle.compress true
          spark.storage.memoryFraction 0.6
          spark.shuffle.memoryFraction 0.5
          spark.shuffle.file.buffer.kb 100
          spark.reducer.maxMbInFlight 48
          spark.shuffle.blockTransferService netty
          spark.shuffle.io.mode nio
          spark.shuffle.io.connectionTimeout 120
          spark.shuffle.manager SORT
          spark.shuffle.io.preferDirectBufs false
          spark.shuffle.io.maxRetries 3
          spark.shuffle.io.retryWaitMs 5000
          spark.scheduler.mode FIFO
          spark.akka.frameSize 10
          spark.akka.timeout 100

          There are about 24G of snappy files for input and 14.5G of shuffle write data.

          With the above config, from the AM's log I find that each container needs more than 13G of memory, so the OOM occurs.

          If I set "spark.shuffle.blockTransferService=nio", each container needs about 12G of memory.

          ilikerps Aaron Davidson added a comment -

          Lianhui Wang I have created #3155 (which I will clean up and try to get in tomorrow); it makes the preferDirectBufs config forcefully disable direct byte buffers in both the server and client pools. Additionally, I have added the conf "spark.shuffle.io.maxUsableCores", which should allow you to inform the executor how many cores you're actually using, so it will avoid allocating enough memory for all the machine's cores.

          I hope that simply specifying the maxUsableCores is sufficient to actually fix this issue for you, but the combination should give a higher chance of success.

          ilikerps Aaron Davidson added a comment -

          Looking at the netty code a bit more, it seems that they might unconditionally allocate direct buffers for IO, whether or not direct is "preferred". Additionally, they allocate more memory based on the number of cores in your system. The default settings would be roughly 16MB per core, and this might be multiplied by 2 in our current setup since we have independent client and server pools in the same JVM. I'm not certain how executors running in YARN report "availableProcessors", but is it possible your machines have 32 or greater cores? This could cause an extra allocation of around 1GB of direct buffers.
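
          A rough back-of-the-envelope sketch of that estimate (the 16MB-per-core figure and the factor of 2 come from the comment above; the numbers are illustrative, not measured):

          // Illustrative arithmetic only, based on the rough defaults described above.
          val perCoreMB = 16            // ~16MB of pooled direct buffers per core
          val pools = 2                 // independent client and server pools in one JVM
          def estimatedOffHeapMB(cores: Int): Int = cores * perCoreMB * pools

          estimatedOffHeapMB(24)        // ~768MB, in line with the ~750MB estimate in this thread
          estimatedOffHeapMB(32)        // ~1024MB, i.e. the ~1GB case mentioned above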

          ilikerps Aaron Davidson added a comment -

          Thanks a lot for those diagnostics. Can you confirm that "spark.shuffle.io.preferDirectBufs" does show up in the UI as being set properly? Does your workload mainly involve a large shuffle? How big is each partition/how many are there? In addition to the netty buffers (which should be disabled by the config), we also memory map shuffle blocks larger than 2MB.
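
          A hypothetical illustration of that size threshold (the ~2MB figure comes from the comment above; the helper and its exact behavior are illustrative, not Spark's actual code):

          import java.nio.ByteBuffer
          import java.nio.channels.FileChannel

          // Illustrative only: read small blocks into an on-heap buffer and
          // memory-map blocks larger than ~2MB. A single read is shown for
          // brevity; a real reader would loop until the buffer is full.
          val mmapThreshold: Long = 2L * 1024 * 1024

          def readBlock(channel: FileChannel, offset: Long, length: Long): ByteBuffer = {
            if (length < mmapThreshold) {
              val buf = ByteBuffer.allocate(length.toInt)
              channel.read(buf, offset)
              buf.flip()
              buf
            } else {
              channel.map(FileChannel.MapMode.READ_ONLY, offset, length)
            }
          }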

          lianhuiwang Lianhui Wang added a comment -

          Zhichao Zhang, in the AM's log you can find this entry:
          Exit status: 143. Diagnostics: Container [container-id] is running beyond physical memory limits. Current usage: 8.3 GB of 8 GB physical memory used; 11.0 GB of 16.8 GB virtual memory used. Killing container.
          I already set spark.yarn.executor.memoryOverhead=1024 and the executor's memory is 7G,
          so from the above log I can confirm that the executor uses a lot of non-heap JVM memory.

          ilikerps Aaron Davidson added a comment -

          Yup, that would work.

          zzcclp Zhichao Zhang added a comment -

          aaron@databricks.com?

          ilikerps Aaron Davidson added a comment -

          Zhichao Zhang Yes, please do. What's the memory of your YARN executors/containers? With preferDirectBufs off, we should allocate little to no off-heap memory, so these results are surprising.

          zzcclp Zhichao Zhang added a comment -

          @Lianhui Wang, how do I view the logs associated with "yarn still kill executor's container because it's physical memory beyond allocated memory"? I can't find them.

          lianhuiwang Lianhui Wang added a comment -

          Aaron Davidson, when I set the Spark config "spark.shuffle.io.preferDirectBufs=false", YARN still kills the executor's container because its physical memory is beyond the allocated memory, so I think Netty has used a lot of non-heap JVM memory.

          zzcclp Zhichao Zhang added a comment -

          @Aaron Davidson, the error still occurs when setting the Spark config "spark.shuffle.io.preferDirectBufs=false". Can I send you an email with the error logs and environment details?

          zzcclp Zhichao Zhang added a comment -

          Yes, I see it, and will start to compile and test.

          rxin Reynold Xin added a comment -

          It's been merged.

          zzcclp Zhichao Zhang added a comment -

          Thank you.

          ilikerps Aaron Davidson added a comment -

          Zhichao Zhang I believe it is close to merging. Reynold Xin is finishing up his review.

          zzcclp Zhichao Zhang added a comment -

          @Aaron Davidson, thank you for your recommendation. By the way, can PR #3101 be merged into master today?

          @Lianhui Wang, I haven't tested it.

          ilikerps Aaron Davidson added a comment -

          Zhichao Zhang Use of epoll mode is highly dependent on your environment, and I personally would not recommend it due to known netty bugs which may cause it to be less stable. We have found nio mode to be sufficiently performant in our testing (and netty actually still tries to use epoll if it's available as its selector).

          Lianhui Wang Could you please elaborate on what you mean?

          Show
          ilikerps Aaron Davidson added a comment - Zhichao Zhang Use of epoll mode is highly dependent on your environment, and I personally would not recommend it due to known netty bugs which may cause it to be less stable. We have found nio mode to be sufficiently performant in our testing (and netty actually still tries to use epoll if it's available as its selector). Lianhui Wang Could you please elaborate on what you mean?
          Hide
          lianhuiwang Lianhui Wang added a comment -

          Aaron Davidson, I used your branch and the memory overhead issue on YARN still exists. Zhichao Zhang, how about your test?

          Show
          lianhuiwang Lianhui Wang added a comment - Aaron Davidson i use your branch and memory overhead on yarn is exist. Zhichao Zhang how about your test.
          Hide
          zzcclp Zhichao Zhang added a comment -

          Hi, Aaron Davidson, when I set spark.shuffle.blockTransferService=netty and spark.shuffle.io.mode=nio, it runs on CentOS 5.8 with 12G of files successfully, but when I set spark.shuffle.blockTransferService=netty and spark.shuffle.io.mode=epoll, there is an error:
          Exception in thread "main" java.lang.UnsatisfiedLinkError: /tmp/libnetty-transport-native-epoll7072694982027222413.so: /lib64/libc.so.6: version `GLIBC_2.10' not found

          I only find GLIBC_2.5 on CentOS 5.8 and cannot upgrade it; how can I resolve this?

          zzcclp Zhichao Zhang added a comment -

          Thank you for your reply. I will focus on it.

          ilikerps Aaron Davidson added a comment - - edited

          This could be due to the netty transfer service allocating more off-heap byte buffers, which perhaps is accounted for differently by YARN. PR #3101, which should go in tomorrow, will include a way to avoid allocating off-heap buffers (by setting the spark config "spark.shuffle.io.preferDirectBufs=false"), which should either solve your problem or at least produce the more typical OutOfMemoryError.

          zzcclp Zhichao Zhang added a comment -

          If I use less data, such as 24G of snappy files, it can run successfully,
          but if I use 240G of snappy files, the above error occurs.

          zzcclp Zhichao Zhang added a comment -

          Yes, I am running in YARN client mode, but the application is running, not killed, and then it fails repeatedly.
          Why? I cannot find any other error.

          Show
          zzcclp Zhichao Zhang added a comment - Yes, running on yarn client mode, but application is running, not killed, and then running failed repeatedly. Why? I can not find other error .
          Hide
          rxin Reynold Xin added a comment -

          Are you running on YARN? It seems like YARN just killed your application.

          zzcclp Zhichao Zhang added a comment - - edited

          Hi, Reynold Xin, SPARK-3453 (Netty-based BlockTransferService, extracted from Spark core) was committed yesterday. I compiled the latest code from GitHub master, and when I set spark.shuffle.blockTransferService=netty, there is an error:

          2014-11-04 15:30:27,013 - ERROR - org.apache.spark.util.SignalLoggerHandler.handle(SignalLogger.scala:57) - RECEIVED SIGNAL 15: SIGTERM
          2014-11-04 15:30:28,484 - ERROR - org.apache.spark.network.client.TransportResponseHandler.channelUnregistered(TransportResponseHandler.java:95) - Still have 6 requests outstanding when connection from np04/203.130.48.183:42574 is closed
          2014-11-04 15:30:28,522 - WARN - org.apache.spark.network.server.TransportChannelHandler.exceptionCaught(TransportChannelHandler.java:66) - Exception in connection from /203.130.48.183:39332
          java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
          at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
          at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
          at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:225)
          at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
          at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
          at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)

          When I set spark.shuffle.blockTransferService=nio, it runs successfully.

          In addition, when will the shuffle performance improvement issue be resolved?

          zzcclp Zhichao Zhang added a comment -

          thanks

          rxin Reynold Xin added a comment -

          Take a look here https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage

          zzcclp Zhichao Zhang added a comment -

          Hi, Reynold Xin, when will version 1.2 be released, approximately?

          rxin Reynold Xin added a comment -

          Scheduled to go in in 1.2.

          zzcclp Zhichao Zhang added a comment -

          Hi, Reynold Xin, when can this issue be solved?
          I need to improve shuffle performance as soon as possible.

          apachespark Apache Spark added a comment -

          User 'rxin' has created a pull request for this issue:
          https://github.com/apache/spark/pull/1971

          apachespark Apache Spark added a comment -

          User 'rxin' has created a pull request for this issue:
          https://github.com/apache/spark/pull/1907

          rxin Reynold Xin added a comment -

          It's something I'd like to prototype for 1.2. Do you have any thoughts on this?

          colorant Raymond Liu added a comment -

          So, is there anyone working on this?

          mridulm80 Mridul Muralidharan added a comment -

          Ah, small files - those are indeed a problem.

          Btw, we do dispose of mmap'ed blocks as soon as they are done, so we don't need to wait for GC to free them. Also note that the files are closed as soon as they are opened and mmap'ed, so they do not count towards the open file count/ulimit.

          Agree on 1, 3 and 4 - some of these apply to sendfile too btw, so they are not avoidable; but it is the best we have right now.
          Since we use mmap'ed buffers and rarely transfer the same file again, the performance jump might not be the order(s) of magnitude other projects claim - but then even a 10% (or whatever) improvement in our case would be substantial!

          rxin Reynold Xin added a comment - - edited

          We do use mmap for large blocks. However, most of the shuffle blocks are small so a lot of blocks are not mapped. In addition, there are multiple problems with memory mapped files:

          1. Memory mapped blocks are off-heap and are not managed by the JVM, which creates another memory space to tune/manage

          2. Memory mapped blocks cannot be reused and are only released at GC, so it is easy to end up with too many open files.

          3. On Linux machines with Huge Pages configured (which is increasingly more common with large memory), the default behavior is each file will consume 2MB, leading to OOM very soon.

          4. For large blocks that span multiple pages, it creates page faults, which lead to unnecessary context switches

          The last one is probably much less important.

          mridulm80 Mridul Muralidharan added a comment -

          Writing mmap'ed buffers is pretty efficient, btw - it's the second fallback in the transferTo implementation, iirc.

          mridulm80 Mridul Muralidharan added a comment -

          We map the file content and directly write that to the socket (except when the size is below 8k or so, iirc) - are you sure we are copying to user space and back?
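
          For context, a minimal sketch of that map-and-write path in plain java.nio (illustrative only; this is not Spark's actual block manager code):

          import java.io.{File, RandomAccessFile}
          import java.nio.channels.{FileChannel, SocketChannel}

          // Illustrative only: memory-map a block file and write the mapped buffer
          // to the socket, as opposed to the transferTo path sketched in the description.
          def sendMappedBlock(file: File, socket: SocketChannel): Unit = {
            val channel = new RandomAccessFile(file, "r").getChannel
            try {
              val mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size())
              while (mapped.hasRemaining) {
                socket.write(mapped) // writes may be partial, so loop until drained
              }
            } finally {
              channel.close()
            }
          }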


            People

             • Assignee: rxin Reynold Xin
             • Reporter: rxin Reynold Xin
             • Votes: 0
             • Watchers: 23
