Details
- Improvement
- Status: Resolved
- Minor
- Resolution: Not A Problem
- 3.0.0
- None
- None
Description
Currently, columnPartition in JDBCRelation adds the first stride to the lower bound before computing the upper limit of the first partition. As a result, the lower bound is not used as the ceiling of the first (lowest) partition.
For example, suppose the data ranges from 0 to 10, there are 10 partitions, and lowerBound is set to 1. The first partition should contain only values < 1. However, the current implementation includes everything < 2.
A simple fix would be to change the following code on line 132:
currentValue += stride
To:
if (i != 0) currentValue += stride
Alternatively, move currentValue += stride into the if statement on line 131, although this introduces a rather ugly side effect.
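To make the difference concrete, here is a minimal Scala sketch that models the boundary loop as described above. It is not the Spark source: partitionClauses, its parameters, and the skipFirstStride flag are hypothetical names introduced for illustration, the stride formula is an assumption, and upperBound = 10 is assumed because the example above does not state it. NULL handling is omitted.

```scala
// Simplified model of the partition-boundary loop discussed in this issue.
// Hypothetical helper, not a Spark API; the stride formula is an assumption.
def partitionClauses(
    column: String,
    lowerBound: Long,
    upperBound: Long,
    numPartitions: Int,
    skipFirstStride: Boolean): Seq[String] = {
  // Integer division: with bounds 1 and 10 and 10 partitions, stride == 1.
  val stride = upperBound / numPartitions - lowerBound / numPartitions
  var currentValue = lowerBound
  (0 until numPartitions).map { i =>
    // The first partition has no lower limit.
    val lBound = if (i != 0) s"$column >= $currentValue" else null
    // Current behaviour always advances; the proposed fix skips i == 0.
    if (!skipFirstStride || i != 0) currentValue += stride
    // The last partition has no upper limit.
    val uBound = if (i != numPartitions - 1) s"$column < $currentValue" else null
    if (uBound == null) lBound
    else if (lBound == null) uBound
    else s"$lBound AND $uBound"
  }
}

val current = partitionClauses("id", 1L, 10L, 10, skipFirstStride = false)
val proposed = partitionClauses("id", 1L, 10L, 10, skipFirstStride = true)
println(current.head)   // id < 2
println(proposed.head)  // id < 1
```

Under these assumptions, the proposed change shifts every boundary down by one stride, so the last partition begins at id >= 9 rather than id >= 10.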
Issue Links
- is related to SPARK-34843 JDBCRelation columnPartition function improperly determines stride size. Upper bound is skewed due to stride alignment. (Resolved)