[SPARK-22170] Broadcast join holds an extra copy of rows in driver memory - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.2, 2.1.1, 2.2.0
Fix Version/s: 2.3.0
Component/s: SQL
Labels: None

Description

While investigating a driver OOM with a memory profiler, I found that building a large broadcast table uses a huge amount of memory. The cause is that BroadcastExchangeExec uses executeCollect. In executeCollect, all of the partitions are fetched as compressed blocks, then each block is decompressed (with a stream), and each row is copied to a new byte buffer and added to an ArrayBuffer, which is then copied to an Array. This results in a huge amount of allocation: a buffer for each row in the broadcast. Those rows are only used to be copied into a BytesToBytesMap that will be broadcast, so there is no need to keep them all in memory.
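To make the allocation pattern concrete, here is a minimal standalone sketch of the collect path described above. It does not use Spark's actual classes: `CollectSketch`, `encodeBlock`, `decodeBlock`, and `collectAll` are hypothetical names, GZIP stands in for whatever codec the blocks really use, and the length-prefixed record layout is an assumption made for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.collection.mutable.ArrayBuffer

object CollectSketch {
  // Hypothetical stand-in for one partition: rows are written as
  // (length, bytes) records and compressed into a single block.
  def encodeBlock(rows: Seq[Array[Byte]]): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(new GZIPOutputStream(bytes))
    rows.foreach { r => out.writeInt(r.length); out.write(r) }
    out.writeInt(-1) // end-of-block marker (an assumption of this sketch)
    out.close()
    bytes.toByteArray
  }

  // Decompress a block as a stream; each row is copied into its own new buffer.
  def decodeBlock(block: Array[Byte]): Iterator[Array[Byte]] = {
    val in = new DataInputStream(new GZIPInputStream(new ByteArrayInputStream(block)))
    Iterator.continually(in.readInt()).takeWhile(_ >= 0).map { len =>
      val row = new Array[Byte](len) // one allocation per row
      in.readFully(row)
      row
    }
  }

  // Collect-style materialization: every row buffer stays reachable, first
  // from the ArrayBuffer and then from the Array it is copied into.
  def collectAll(blocks: Seq[Array[Byte]]): Array[Array[Byte]] = {
    val buf = new ArrayBuffer[Array[Byte]]()
    blocks.foreach(b => buf ++= decodeBlock(b))
    buf.toArray
  }
}
```

In this model the result of `collectAll` keeps one buffer per row alive until the caller is done with the whole array, which is the retention the profiler showed on the driver.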

Replacing the ArrayBuffer step with an iterator reduces the amount of memory held while creating the map, because the rows no longer all need to be in memory at once. It also avoids allocating a large Array for the rows. In practice, a 16 MB broadcast table used 100 MB less memory with this approach, but the reduction depends on the row size and compression (the 16 MB figure is the size in Parquet format).
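The difference between the two approaches can be sketched without Spark by counting how many rows have been produced but not yet inserted into the table. Everything here is hypothetical: `RowSource`, `build`, `viaArray`, and `viaIterator` are illustration-only names, and a `mutable.HashMap` stands in for BytesToBytesMap. Unlike the real map, this stand-in keeps a reference to each payload instead of copying its bytes, so only the pending count is meaningful.

```scala
import scala.collection.mutable

object IteratorBuildSketch {
  // Hypothetical row source: yields a fresh (key, payload) pair on demand
  // and counts how many rows have been produced so far.
  final class RowSource(n: Int) {
    var produced = 0
    def rows: Iterator[(Long, Array[Byte])] =
      Iterator.tabulate(n) { i => produced += 1; (i.toLong, new Array[Byte](64)) }
  }

  // Builds the lookup table and records the largest number of rows that were
  // produced but not yet inserted at any point during the build.
  def build(
      rows: Iterator[(Long, Array[Byte])],
      src: RowSource): (mutable.HashMap[Long, Array[Byte]], Int) = {
    val table = mutable.HashMap.empty[Long, Array[Byte]]
    var maxPending = 0
    rows.foreach { case (k, v) =>
      maxPending = math.max(maxPending, src.produced - table.size)
      table.put(k, v)
    }
    (table, maxPending)
  }

  // Collect-style: all rows are materialized into an Array before the build starts.
  def viaArray(n: Int): (mutable.HashMap[Long, Array[Byte]], Int) = {
    val s = new RowSource(n)
    build(s.rows.toArray.iterator, s)
  }

  // Iterator-style: each row is produced only when the build asks for it.
  def viaIterator(n: Int): (mutable.HashMap[Long, Array[Byte]], Int) = {
    val s = new RowSource(n)
    build(s.rows, s)
  }
}
```

With the array path every row exists before the first insert, while the iterator path never has more than one row waiting. How much memory that saves in a real broadcast still depends on row size and compression, as noted above.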

Issue Links

links to

[Github] Pull Request #19394 (rdblue)

People

Assignee: Ryan Blue

Reporter: Ryan Blue

Votes: 0

Watchers: 4

Dates

Created: 29/Sep/17 23:52

Updated: 09/Oct/17 22:23

Resolved: 09/Oct/17 22:23