There are 2 issues with the realtime input format:
- Delta records (updates) might not contain the entire row change log. For such an update, we need to be able to call preCombine of the HoodieRecordPayload implementation so that the existing data from parquet (full row change log) is merged with the new column being updated.
- If a custom implementation of HoodieRecordPayload performs some custom computation of columns, the realtime input format currently misses it. We need to honor that by calling preCombine.
Both of the above are use cases for power users who implement their own custom record payload. Since this is not common, this is lower priority.
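
To illustrate the first issue, here is a minimal sketch of how a read path could delegate merging to the payload. It does not use Hudi's actual classes: SketchPayload, PartialRowPayload, RealtimeMergeSketch and mergeForRead are hypothetical names invented for this example, and the real HoodieRecordPayload interface has a different signature.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a record payload; NOT Hudi's HoodieRecordPayload interface.
interface SketchPayload<T extends SketchPayload<T>> {
    // Merge this (newer) payload with an older payload for the same record key.
    T preCombine(T older);
}

// Hypothetical payload that carries only the columns present in a record.
// A column missing from the map means "not changed by this update".
final class PartialRowPayload implements SketchPayload<PartialRowPayload> {
    final Map<String, Object> columns;

    PartialRowPayload(Map<String, Object> columns) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.columns = new LinkedHashMap<>(columns);
    }

    @Override
    public PartialRowPayload preCombine(PartialRowPayload older) {
        // Start from the older (full) row, then overlay the updated columns.
        Map<String, Object> merged = new LinkedHashMap<>(older.columns);
        merged.putAll(this.columns);
        return new PartialRowPayload(merged);
    }
}

// Hypothetical read-side helper: instead of returning the delta record as-is,
// it asks the payload to merge itself with the full row read from the base file.
final class RealtimeMergeSketch {
    static Map<String, Object> mergeForRead(Map<String, Object> baseRow,
                                            PartialRowPayload delta) {
        return delta.preCombine(new PartialRowPayload(baseRow)).columns;
    }
}
```

With a base row {id=1, name=a, city=x} and a delta containing only {city=y}, mergeForRead returns the full row with city replaced by y. Because the merge logic lives in the payload, a custom payload that also computes derived columns inside preCombine would be honored by the same call, which is the second issue above.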