  1. Apache Avro
  2. AVRO-3229

Python Avro doesn't validate the default value of an enum field

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.10.2
    • Fix Version/s: 1.11.1
    • Component/s: python
    • Environment:
      python --version
      Python 3.9.5

      pip freeze | grep avro
      avro==1.10.2

    Description

      The following schema is rejected by the Java implementation (it fails to compile) because the enum's default value, "UNKNOWN", is not one of its symbols:

       

      {
        "type": "record",
        "name": "test_schema",
        "fields": [
          {
            "name": "test_enum",
            "type": {
              "name": "test_enum_type",
              "type": "enum",
              "symbols": [
                "NONE"
              ],
              "default": "UNKNOWN"
            }
          }
        ]
      }
      

      This matches the behavior documented in the spec:

       

      default: A default value for this enumeration, used during resolution when the reader encounters a symbol from the writer that isn't defined in the reader's schema (optional). The value provided here must be a JSON string that's a member of the symbols array. See documentation on schema resolution for how this gets used.
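To illustrate why the default has to be one of the symbols, here is a minimal sketch of the resolution rule quoted above. It is not the Avro library's code; `resolve_enum_symbol` and its parameters are hypothetical names used only for illustration.

```python
def resolve_enum_symbol(writer_symbol, reader_symbols, reader_default=None):
    """Sketch of enum resolution for one symbol, following the spec text.

    Hypothetical helper, not part of the avro package.
    """
    # The writer's symbol is known to the reader: use it unchanged.
    if writer_symbol in reader_symbols:
        return writer_symbol
    # Unknown symbol: fall back to the reader's default when one is declared.
    if reader_default is not None:
        return reader_default
    # No default declared: resolution fails.
    raise ValueError(
        f"symbol {writer_symbol!r} is not in the reader's enum and no default is set"
    )


reader_symbols = ["NONE"]

# With a valid default, an unknown writer symbol maps to a real reader symbol.
valid = resolve_enum_symbol("OTHER", reader_symbols, reader_default="NONE")

# With the default from this issue, the result is not a symbol of the reader's enum.
invalid = resolve_enum_symbol("OTHER", reader_symbols, reader_default="UNKNOWN")
print(valid in reader_symbols, invalid in reader_symbols)
```

If the default is not validated when the schema is parsed, the fallback can hand the reader a value that its own enum does not contain, which is the situation the spec's membership requirement rules out.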

      But the same schema is silently accepted by the Python library (although the writer doesn't allow the invalid value to be produced):

      import avro.schema
      from avro.datafile import DataFileReader, DataFileWriter
      from avro.io import DatumReader, DatumWriter
      
      # Parsing succeeds even though the enum default "UNKNOWN" is not a symbol.
      with open("test.avsc", "rb") as handler:
          schema = avro.schema.parse(handler.read())
      
      DATA_FILE = "test.avro"
      
      with open(DATA_FILE, "wb") as handler:
          writer = DataFileWriter(handler, DatumWriter(), schema)
          writer.append({"test_enum": "NONE"})
          # The writer rejects the invalid value if the next line is uncommented:
          # writer.append({"test_enum": "UNKNOWN"})
          # writer.append({})
          writer.close()
      
      # Read the records back and print them.
      with open(DATA_FILE, "rb") as handler:
          for record in DataFileReader(handler, DatumReader()):
              print(record)
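
On affected versions, one possible workaround is to check the schema JSON before handing it to the library. The following is a sketch using only the standard library; `find_invalid_enum_defaults` is a hypothetical helper, not an Avro API, and it only covers inline type definitions (records, arrays, maps, unions), not named-type references.

```python
import json

# The schema from this issue, inlined so the example is self-contained.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "test_schema",
  "fields": [
    {
      "name": "test_enum",
      "type": {
        "name": "test_enum_type",
        "type": "enum",
        "symbols": ["NONE"],
        "default": "UNKNOWN"
      }
    }
  ]
}
"""


def find_invalid_enum_defaults(schema, path="$"):
    """Return (path, default) pairs for enum defaults missing from symbols.

    Hypothetical helper for illustration; not part of the avro package.
    """
    problems = []
    if isinstance(schema, list):
        # A JSON array is a union: check every branch.
        for index, branch in enumerate(schema):
            problems += find_invalid_enum_defaults(branch, f"{path}[{index}]")
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "enum":
            if "default" in schema and schema["default"] not in schema.get("symbols", []):
                problems.append((path, schema["default"]))
        elif kind in ("record", "error"):
            for field in schema.get("fields", []):
                problems += find_invalid_enum_defaults(
                    field["type"], f"{path}.{field['name']}"
                )
        elif kind == "array":
            problems += find_invalid_enum_defaults(schema["items"], f"{path}.items")
        elif kind == "map":
            problems += find_invalid_enum_defaults(schema["values"], f"{path}.values")
        elif isinstance(kind, (dict, list)):
            # Nested definition such as {"type": {"type": "enum", ...}}.
            problems += find_invalid_enum_defaults(kind, path)
    return problems


problems = find_invalid_enum_defaults(json.loads(SCHEMA_JSON))
for location, default in problems:
    print(f"{location}: enum default {default!r} is not in symbols")
```

Running the check before `avro.schema.parse` makes the invalid default visible at load time instead of leaving it unnoticed in the parsed schema.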
      

          People

            Assignee: Michael A. Smith (kojiromike)
            Reporter: Augusto Hack (hack.augusto)
            Votes: 0
            Watchers: 4

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h