Flink / FLINK-25962

Flink generated Avro schemas can't be parsed using Python

Details

    • Avro schemas generated by Flink now use the "org.apache.flink.avro.generated" namespace for compatibility with the Avro Python SDK.

    Description

      Flink currently generates Avro schemas as records with the top-level name "record".

      Unfortunately, Avro implementations in different languages are inconsistent about whether this name is allowed, which can prevent such a record from being read. Notably, the Python SDK rejects it with the error:
      avro.schema.SchemaParseException: record is a reserved type name
      (See the comment on FLINK-18096 for the full stack trace.)

      The Java SDK accepts this name, and there is an ongoing discussion about what the expected behaviour should be. This should be clarified and fixed in Avro itself.

      Regardless of how that is resolved, the best practice (used almost everywhere else in the Flink codebase) is to explicitly specify a top-level namespace for an Avro record. We should use a default such as org.apache.flink.avro.generated.
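
The effect of adding a namespace can be sketched in plain Python without the Avro library. This is only an illustration under stated assumptions: the schema below is a hypothetical one shaped like the record described above, and `full_name`, `is_reserved`, and `with_default_namespace` are made-up helpers that imitate a reserved-name check on the record's full name. They are not the Avro Python SDK's actual code, nor Flink's implementation of the fix.

```python
import json

# Assumption: a simplified stand-in for a reserved-name check, not the
# Avro Python SDK's real implementation. The set lists Avro's built-in
# type names, which a record's full name would collide with.
RESERVED_TYPE_NAMES = {
    "null", "boolean", "int", "long", "float", "double", "bytes", "string",
    "record", "enum", "fixed", "error", "array", "map", "union",
}

# Default namespace proposed in the issue.
DEFAULT_NAMESPACE = "org.apache.flink.avro.generated"


def full_name(schema: dict) -> str:
    """Return namespace and name joined by a dot, or the bare name."""
    namespace = schema.get("namespace")
    name = schema["name"]
    return f"{namespace}.{name}" if namespace else name


def is_reserved(schema: dict) -> bool:
    """Report whether the schema's full name collides with a built-in type."""
    return full_name(schema) in RESERVED_TYPE_NAMES


def with_default_namespace(schema: dict, namespace: str = DEFAULT_NAMESPACE) -> dict:
    """Return a copy of the schema, adding a namespace only if none is set."""
    if schema.get("namespace"):
        return dict(schema)
    return {**schema, "namespace": namespace}


# Hypothetical schema: a top-level record named "record" with no namespace.
generated = json.loads("""
{
  "type": "record",
  "name": "record",
  "fields": [{"name": "id", "type": "long"}]
}
""")

fixed = with_default_namespace(generated)

print(is_reserved(generated))  # the bare name "record" collides
print(full_name(fixed))        # the namespaced full name
print(is_reserved(fixed))      # the namespaced name no longer collides
```

With a namespace, the full name is no longer the bare word "record", so a check against built-in type names has nothing to collide with, whichever way the Avro discussion is eventually resolved.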

            People

              rskraba Ryan Skraba
              rskraba Ryan Skraba