Apache Hudi / HUDI-837

Fix AvroKafkaSource to use the latest schema for reading


Details

    Description

      Currently, AvroKafkaSource sets value.deserializer to KafkaAvroDeserializer. As a result, each published record is read with the schema it was written with, even if the schema has evolved in the meantime. Messages in a single incoming batch can therefore have different schemas, and the mismatch has to be handled later, when the records are actually written to Parquet.

      This Jira aims to provide an option to read all messages with the same (latest) schema by implementing a new custom deserializer class.
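
      To make the idea concrete, here is a minimal, dependency-free sketch of what "reading every message with the same schema" amounts to. It does not use the Avro, Kafka, or Hudi APIs: the class LatestSchemaProjection, its Field type, and the project method are hypothetical names introduced only for illustration, and records are modelled as plain maps. A real deserializer would presumably rely on Avro's own schema resolution between the writer's schema and a single reader schema instead of hand-written projection.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Dependency-free model of projecting records onto one reader schema.
// NOT the Avro or Hudi API: every name here is hypothetical.
final class LatestSchemaProjection {

    // One field of the (hypothetical) reader schema: its name and the
    // default used when an older record does not contain the field.
    static final class Field {
        final String name;
        final Object defaultValue;

        Field(String name, Object defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }
    }

    // Rewrites one decoded record so that it has exactly the reader
    // schema's fields, in the reader schema's order.
    static Map<String, Object> project(Map<String, Object> written,
                                       List<Field> readerSchema) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Field f : readerSchema) {
            // Keep the written value if present, else use the default.
            Object value = written.containsKey(f.name)
                    ? written.get(f.name)
                    : f.defaultValue;
            out.put(f.name, value);
        }
        // Fields known only to the writer's schema are never copied,
        // so they are dropped from the result.
        return out;
    }
}
```

      With a reader schema of (id, email), a record written before email existed gains the default value for email, and a record carrying an extra writer-only field loses that field, so every record in the batch ends up with the same shape before it reaches the Parquet writer.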


            People

              shivnarayan sivabalan narayanan
              Pratyaksh Pratyaksh Sharma