SPARK-8893: Require positive partition counts in RDD.repartition


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version: 1.4.0
    • Fix Version: 1.5.0
    • Component: Spark Core
    • Labels: None

    Description

      What does sc.parallelize(1 to 3).repartition(p).collect return? I would expect Array(1, 2, 3) regardless of p. But if p < 1, it returns Array(). I think instead it should throw an IllegalArgumentException.

      I think the case is pretty clear for p < 0. But the behavior for p = 0 is also error-prone. In fact, that is how I found this strange behavior: I used rdd.repartition(a/b) with positive a and b, but the integer division a/b was rounded down to zero, and the results surprised me. I'd prefer an exception over unexpected (corrupt) results.
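      The pitfall can be sketched with a toy model in plain Python (a hypothetical sketch for illustration, not Spark's actual implementation): a repartition that silently builds zero buckets makes collect return nothing, while an up-front positive-count check, like the one proposed here, surfaces the bug immediately.

      ```python
      def repartition(data, num_partitions):
          """Toy model of RDD.repartition: round-robin elements into buckets.

          The guard mirrors the fix proposed in this issue: reject
          non-positive partition counts instead of returning no data.
          """
          if num_partitions <= 0:
              raise ValueError(
                  f"Number of partitions ({num_partitions}) must be positive.")
          buckets = [[] for _ in range(num_partitions)]
          for i, x in enumerate(data):
              buckets[i % num_partitions].append(x)
          return buckets

      def collect(buckets):
          # Concatenate all partitions back into one list, like RDD.collect.
          return [x for bucket in buckets for x in bucket]

      # Without the guard, repartition(data, a // b) with a=1, b=2 would build
      # zero buckets, and collect() would silently return [] -- corrupt results.
      a, b = 1, 2
      try:
          repartition([1, 2, 3], a // b)  # a // b == 0, so this raises
      except ValueError as e:
          print(e)

      # With a valid count, all elements survive the round trip
      # (order across partitions may differ from the input, as in Spark).
      print(sorted(collect(repartition([1, 2, 3], 2))))  # [1, 2, 3]
      ```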

      I'm happy to send a pull request for this.


              People

                Assignee: Daniel Darabos (darabos)
                Reporter: Daniel Darabos (darabos)
                Votes: 1
                Watchers: 2
