Apache Hudi / HUDI-152

Invoke preCombine in realtime view by converting ArrayWritable to Avro

      Description

      There are 2 issues with the realtime input format:

       

      1. Delta records (updates) might not contain the entire row change log. For such an update, we need to be able to call preCombine of the HoodieRecordPayload implementation so that existing data from parquet (the full row change log) is merged with the new column being updated.
      2. A custom implementation of HoodieRecordPayload may compute some columns itself. The realtime input format currently skips that computation. We need to honor it by calling preCombine.

       

      Both of the above are use cases for power users who implement their own custom record payload. Since this is not common, this issue is lower priority.
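
A rough sketch of the two behaviors described in this issue, for illustration only. It uses plain Java maps instead of Avro records, and the class `PartialUpdateSketch`, its methods, and the column names (`price`, `quantity`, `total`) are all hypothetical; none of this is Hudi's actual HoodieRecordPayload API or the realtime input format code. It assumes a partial delta carries only the changed columns and that a custom payload derives one column from others.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a custom payload's merge logic (NOT Hudi's API).
final class PartialUpdateSketch {

    // Issue 1: overlay the columns present in the delta record on top of
    // the full row read from parquet. Null delta values are treated as
    // "column not updated" (an assumption of this sketch).
    static Map<String, Object> mergePartial(Map<String, Object> baseRow,
                                            Map<String, Object> delta) {
        Map<String, Object> merged = new LinkedHashMap<>(baseRow);
        for (Map.Entry<String, Object> e : delta.entrySet()) {
            if (e.getValue() != null) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    // Issue 2: a custom computed column that must be refreshed after the
    // merge. Here total = price * quantity (hypothetical columns).
    static Map<String, Object> recompute(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>(row);
        Object price = row.get("price");
        Object quantity = row.get("quantity");
        if (price instanceof Number && quantity instanceof Number) {
            double total = ((Number) price).doubleValue()
                    * ((Number) quantity).intValue();
            out.put("total", total);
        }
        return out;
    }
}
```

If the realtime view returned the delta record as-is, a reader would see only the updated column and a stale or missing computed column. Applying something like `recompute(mergePartial(baseRow, delta))` at read time is the kind of behavior that invoking the payload's own merge logic would provide.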

            People

            • Assignee: Nishith Agarwal (nishith29)
            • Reporter: Nishith Agarwal (nishith29)