Details
- Type: Bug
- Status: Patch Available
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.3.0, 3.0.0
- Fix Version/s: None
- Component/s: None
Description
else {
  // numPartitionFields = -1 means random partitioning
  partitionCols.add(TypeCheckProcFactory.DefaultExprProcessor.getFuncExprNodeDesc("rand"));
}
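This is a known cause of data corruption when failed tasks are re-executed: rand() is not deterministic, so a retried task can send rows to different reducers than the original attempt did.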
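The failure mode can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Hive code: the class `RetrySimulation` and its methods are hypothetical names, there are exactly two reducers, reducer 0 fetches its rows from the first mapper attempt, and reducer 1 fetches from a retried attempt after the first one fails.

```java
import java.util.Random;

// Hypothetical simulation (not part of Hive): one mapper, two reducers.
class RetrySimulation {

    // Assign each row to a reducer the way a rand()-based partitioner would.
    static int[] partition(int rows, int reducers, Random rng) {
        int[] assignment = new int[rows];
        for (int i = 0; i < rows; i++) {
            assignment[i] = rng.nextInt(reducers);
        }
        return assignment;
    }

    // Reducer 0 reads the first attempt's output, reducer 1 reads the retry's.
    // Returns {lostRows, duplicatedRows}.
    static int[] fetchAcrossAttempts(int[] firstAttempt, int[] retryAttempt) {
        int lost = 0;
        int duplicated = 0;
        for (int row = 0; row < firstAttempt.length; row++) {
            int copies = 0;
            if (firstAttempt[row] == 0) copies++; // delivered to reducer 0
            if (retryAttempt[row] == 1) copies++; // delivered to reducer 1
            if (copies == 0) lost++;
            if (copies == 2) duplicated++;
        }
        return new int[] {lost, duplicated};
    }

    public static void main(String[] args) {
        int rows = 1000;
        // Unseeded generators: the two attempts partition independently.
        int[] first = partition(rows, 2, new Random());
        int[] retry = partition(rows, 2, new Random());
        int[] result = fetchAcrossAttempts(first, retry);
        System.out.println("lost=" + result[0] + " duplicated=" + result[1]);
    }
}
```

Every row on which the two attempts disagree is either lost or duplicated, so the output is only correct if both attempts produce the same assignment.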
ReduceSinkOperator already contains a fault-tolerant distribution function, which kicks in automatically when no partition columns are used:
if (partitionEval.length == 0) {
  // If no partition cols, just distribute the data uniformly
  // to provide better load balance. If the requirement is to have a single reducer, we should
  // set the number of reducers to 1. Use a constant seed to make the code deterministic.
  if (random == null) {
    random = new Random(12345);
  }
  keyHashCode = random.nextInt();
}
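To show why the constant seed matters, here is a minimal standalone sketch of the same idea. The class `SeededHashCodes` and its methods are hypothetical, and the `bucket` mapping from hash code to reducer is an assumption made for illustration, not taken from ReduceSinkOperator.

```java
import java.util.Random;

// Hypothetical sketch (not Hive code) of a constant-seed hash code sequence.
class SeededHashCodes {

    // Each task attempt builds a fresh generator from the same seed,
    // so the n-th row always receives the same hash code.
    static int[] hashCodes(int rows, long seed) {
        Random random = new Random(seed);
        int[] codes = new int[rows];
        for (int i = 0; i < rows; i++) {
            codes[i] = random.nextInt();
        }
        return codes;
    }

    // Assumed mapping from hash code to reducer: clear the sign bit, then modulo.
    static int bucket(int keyHashCode, int numReducers) {
        return (keyHashCode & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int[] firstAttempt = hashCodes(5, 12345L);
        int[] retryAttempt = hashCodes(5, 12345L);
        for (int i = 0; i < firstAttempt.length; i++) {
            System.out.println("row " + i
                + " -> reducer " + bucket(firstAttempt[i], 4)
                + " (retry: reducer " + bucket(retryAttempt[i], 4) + ")");
        }
    }
}
```

The determinism here is positional: a retried attempt reproduces the original assignment only if it reads its rows in the same order as the first attempt.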
Attachments
Issue Links
- causes: KYLIN-3388 Data may become not correct if mappers fail during the redistribute step, "distribute by rand()" (Closed)