[SPARK-24985] Executing SQL with "Full Outer Join" on top of large tables when there is data skew met OOM - ASF JIRA

Details

Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.3.1
Fix Version/s: None
Component/s: SQL
Labels: None

Description

When we run SQL with a "Full Outer Join" on large tables that have data skew, it is quite easy to hit OOM. We first thought we had hit https://issues.apache.org/jira/browse/SPARK-13450, but after looking at the fix in https://github.com/apache/spark/pull/16909, we found that the PR does not handle the "Full Outer Join" case.

The root cause of the OOM is that a large number of rows share the same join key.
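
To make the effect of skew concrete, here is a minimal sketch in plain Scala. It is not Spark code, and `SkewSketch` and `bufferedRowsPerKey` are hypothetical names used only for illustration. It assumes a merge join that buffers every matching row from both sides, and counts how many rows such a join would hold in memory at once for each key.

```scala
object SkewSketch {
  // For each join key, a scanner that buffers all matching rows from both
  // sides holds (left row count + right row count) rows at the same time.
  def bufferedRowsPerKey(left: Seq[Int], right: Seq[Int]): Map[Int, Int] = {
    val leftCounts = left.groupBy(identity).map { case (k, rows) => k -> rows.size }
    val rightCounts = right.groupBy(identity).map { case (k, rows) => k -> rows.size }
    (leftCounts.keySet ++ rightCounts.keySet).map { k =>
      k -> (leftCounts.getOrElse(k, 0) + rightCounts.getOrElse(k, 0))
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    // Key 1 is heavily skewed on both sides; the other keys are rare.
    val left = Seq.fill(1000)(1) ++ Seq(2, 3)
    val right = Seq.fill(500)(1) ++ Seq(3, 4)
    val (hotKey, buffered) = bufferedRowsPerKey(left, right).maxBy(_._2)
    println(s"key=$hotKey buffered=$buffered") // prints "key=1 buffered=1500"
  }
}
```

With real tables the same count for a hot key can reach many millions of rows, all of which must fit in a single task's memory.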

See the code below:

private def findMatchingRows(matchingKey: InternalRow): Unit = {
  leftMatches.clear()
  rightMatches.clear()
  leftIndex = 0
  rightIndex = 0
  // Every left row with the matching key is copied into leftMatches.
  while (leftRowKey != null && keyOrdering.compare(leftRowKey, matchingKey) == 0) {
    leftMatches += leftRow.copy()
    advancedLeft()
  }
  // Every right row with the matching key is copied into rightMatches.
  while (rightRowKey != null && keyOrdering.compare(rightRowKey, matchingKey) == 0) {
    rightMatches += rightRow.copy()
    advancedRight()
  }
  // ... (remainder of the method omitted)
}

It seems that nothing limits how many rows are added to leftMatches and rightMatches, so every row that matches a skewed key is copied into memory.
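
One general way to bound such a buffer is to keep only a fixed number of rows on the heap and spill the rest to disk. The sketch below illustrates that idea in plain Scala under stated assumptions: `SpillableBuffer` and `maxInMemory` are hypothetical names, rows are assumed to be `java.io.Serializable`, and Java serialization to a temp file stands in for a real spill format. It is not Spark's implementation and not necessarily what either linked pull request does.

```scala
import java.io.{BufferedInputStream, BufferedOutputStream, File}
import java.io.{FileInputStream, FileOutputStream}
import java.io.{ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration: holds at most `maxInMemory` rows on the heap
// and appends any further rows to a temporary file.
final class SpillableBuffer[T <: java.io.Serializable](maxInMemory: Int) {
  private val inMemory = new ArrayBuffer[T]()
  private var spilledCount = 0

  // The spill file and its writer are created only when the first spill happens.
  private lazy val spillFile: File = {
    val f = File.createTempFile("spill", ".bin")
    f.deleteOnExit()
    f
  }
  private lazy val out =
    new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(spillFile)))

  def +=(row: T): Unit = {
    if (inMemory.size < maxInMemory) {
      inMemory += row
    } else {
      out.writeObject(row)
      out.reset() // drop the stream's back-references so written rows can be collected
      spilledCount += 1
    }
  }

  def inMemorySize: Int = inMemory.size
  def size: Int = inMemory.size + spilledCount

  // Yields the in-memory rows first, then reads spilled rows back lazily.
  // For brevity this sketch never closes the input stream.
  def iterator: Iterator[T] = {
    if (spilledCount == 0) {
      inMemory.iterator
    } else {
      out.flush()
      val in = new ObjectInputStream(
        new BufferedInputStream(new FileInputStream(spillFile)))
      inMemory.iterator ++ Iterator.fill(spilledCount)(in.readObject().asInstanceOf[T])
    }
  }
}
```

For example, `new SpillableBuffer[String](2)` keeps the first two rows on the heap and writes later ones to disk, while `iterator` still returns all rows in insertion order. A production version would also need cleanup of the spill file and a cheaper row encoding.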

Attachments

Issue Links

links to

[Github] Pull Request #22168 (sujithjay)

[Github] Pull Request #29071 (sidedoorleftroad)

Activity

People

Assignee: Unassigned

Reporter: sheperd huang

Votes: 0

Watchers: 8

Dates

Created: 01/Aug/18 01:54

Updated: 11/Jul/20 02:55