Apache Hudi / HUDI-1716

Realtime (rt) view with MOR tables fails after schema evolution


    Details

      Description

      It looks like the realtime view of a MOR table fails if the schema present in an existing log file is later evolved to add a new field. Writing works without issues, but reading fails.

      More info: https://github.com/apache/hudi/issues/2675

       

      gist of the stack trace:

      Caused by: org.apache.avro.AvroTypeException: Found hoodie.hudi_trips_cow.hudi_trips_cow_record, expecting hoodie.hudi_trips_cow.hudi_trips_cow_record, missing required field evolvedField
        at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:130)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:215)
        at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
        at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock.deserializeRecords(HoodieAvroDataBlock.java:165)
        at org.apache.hudi.common.table.log.block.HoodieDataBlock.createRecordsFromContentBytes(HoodieDataBlock.java:128)
        at org.apache.hudi.common.table.log.block.HoodieDataBlock.getRecords(HoodieDataBlock.java:106)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processDataBlock(AbstractHoodieLogRecordScanner.java:289)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:324)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:252)
        ... 24 more

      21/03/25 11:27:03 WARN TaskSetManager: Lost task 0.0 in stage 83.0 (TID 667, sivabala-c02xg219jgh6.attlocal.net, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:261)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:100)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:93)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:75)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:230)
        at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:328)
        at org.apache.hudi.HoodieMergeOnReadRDD$$anon$3.<init>(HoodieMergeOnReadRDD.scala:210)
        at org.apache.hudi.HoodieMergeOnReadRDD.payloadCombineFileIterator(HoodieMergeOnReadRDD.scala:200)
        at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:77)
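
      The "missing required field evolvedField" message matches Avro's schema-resolution rule: when the reader schema has a field that the writer schema lacks, the read only succeeds if the reader field declares a default. The sketch below is a simplified pure-Python model of that rule, not Avro or Hudi code. The field names uuid, ts and fare, the helper resolve_record and the exception class are made up for illustration; only evolvedField comes from the trace. Whether the eventual Hudi fix relies on defaults is not stated in this ticket.

```python
# Simplified model of Avro's record resolution rule (an illustration only;
# this does not call the Avro library).
_NO_DEFAULT = object()  # sentinel: the reader field declares no default


class MissingRequiredField(Exception):
    """Raised when a reader field is absent from the writer schema and has no default."""


def resolve_record(writer_fields, reader_fields, datum):
    """Project a record written with writer_fields onto reader_fields.

    writer_fields: list of field names in the writer (log block) schema
    reader_fields: list of (name, default) pairs for the reader schema
    datum:         dict of values as written with the writer schema
    """
    writer_names = set(writer_fields)
    resolved = {}
    for name, default in reader_fields:
        if name in writer_names:
            resolved[name] = datum[name]   # field present in both schemas
        elif default is not _NO_DEFAULT:
            resolved[name] = default       # filled from the reader's default
        else:
            raise MissingRequiredField(f"missing required field {name}")
    return resolved


# Hypothetical schema1: what the older log block was written with.
schema1 = ["uuid", "ts", "fare"]
old_record = {"uuid": "a1", "ts": 1, "fare": 10.5}

base = [(name, _NO_DEFAULT) for name in schema1]
# schema2 adds evolvedField with no default, as in the failing read.
schema2_no_default = base + [("evolvedField", _NO_DEFAULT)]
# Same evolution, but the new field carries a null default.
schema2_with_default = base + [("evolvedField", None)]

if __name__ == "__main__":
    print(resolve_record(schema1, schema2_with_default, old_record))
    try:
        resolve_record(schema1, schema2_no_default, old_record)
    except MissingRequiredField as err:
        print("read failed:", err)
```

      In this model the old record can be read with the evolved schema only when evolvedField has a default, which is consistent with writes succeeding while reads of older log blocks fail.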

       

      Logs from local run: 

      https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198

      Diff used to generate the logs above: https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec

       

      Steps to reproduce in spark shell:

      1. Create a MOR table with schema1.
      2. Ingest (with schema1) until log files are created. Verify via hudi-cli; it took me 2 batches of updates to see a log file.
      3. Create a new schema2 with one additional field. Ingest a batch with schema2 that updates existing records.
      4. Read the entire dataset.

            People

            • Assignee: Aditya Tiwari (aditiwari)
            • Reporter: sivabalan narayanan (shivnarayan)