Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
When HoodieMergeOnReadRDD reads a record from the base file, it creates a new InternalRow based on requiredStructSchema:
    private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
      val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
      val posIterator = requiredFieldPosition.iterator
      var curIndex = 0
      tableState.requiredStructSchema.foreach(
        f => {
          val curPos = posIterator.next()
          val curField = row.get(curPos, f.dataType)
          rowToReturn.update(curIndex, curField)
          curIndex = curIndex + 1
        }
      )
      rowToReturn
    }
Hudi does not check whether a field is null before reading its value here; row.get is called for every field.
If vectorization is enabled, row is a ColumnarBatchRow. A ColumnarBatchRow may return a non-null value even when the field itself is null, so Hudi may write a non-null value into a field that should be null.
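
One way to guard against this is to check isNullAt on the source row before reading each field. The following is a minimal sketch, not necessarily the change that resolved this issue. The object name NullSafeProjection is hypothetical, and the schema and field positions are passed as parameters (an assumption made so the example stands alone) instead of being read from tableState.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.SpecificInternalRow
import org.apache.spark.sql.types.StructType

// Hypothetical helper, for illustration only.
object NullSafeProjection {

  // requiredFieldPosition(i) is the position in `row` of the i-th field
  // of requiredStructSchema.
  def createRowWithRequiredSchema(
      row: InternalRow,
      requiredStructSchema: StructType,
      requiredFieldPosition: Seq[Int]): InternalRow = {
    val rowToReturn = new SpecificInternalRow(requiredStructSchema)
    requiredStructSchema.fields.zip(requiredFieldPosition).zipWithIndex.foreach {
      case ((field, srcPos), dstIndex) =>
        if (row.isNullAt(srcPos)) {
          // Do not call row.get here: a columnar row may hand back a
          // stale or default value for a slot that is logically null.
          rowToReturn.setNullAt(dstIndex)
        } else {
          rowToReturn.update(dstIndex, row.get(srcPos, field.dataType))
        }
    }
    rowToReturn
  }
}
```

With this check in place, the output row carries the null flag of the source field, regardless of what the underlying column vector holds at that slot.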