  Apache Avro / AVRO-3760

Using enum with default symbol, cannot parse future value

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.11.1
    • Fix Version/s: 1.13.0
    • Component/s: python
    • Environment:
      $ pip freeze | grep -i avro
      avro==1.11.1
      $ python --version
      Python 3.8.16
      

    Description

      Support for enum default symbols appears to be broken in the Python implementation. Both schemas in the example below declare "unknown" as the enum default, so I expected that a value written with a newer schema that adds a symbol would be read as the default when decoded with the old schema.

      import io
      from avro.io import DatumReader, DatumWriter, BinaryDecoder, BinaryEncoder
      import avro.schema
      
      current_schema = avro.schema.parse("""
      {
          "fields": [
              {
                  "default": "unknown",
                  "name": "checksum_algorithm",
                  "type": {
                      "name": "ChecksumAlgorithm",
                      "symbols": [
                          "unknown",
                          "xxhash3_64_be"
                      ],
                      "type": "enum",
                      "default": "unknown"
                  }
              }
          ],
          "name": "Metadata",
          "type": "record"
      }
      """)
      
      # Future schema adds the "crc32_be" symbol.
      future_schema = avro.schema.parse("""
      {
          "fields": [
              {
                  "default": "unknown",
                  "name": "checksum_algorithm",
                  "type": {
                      "name": "ChecksumAlgorithm",
                      "symbols": [
                          "unknown",
                          "xxhash3_64_be",
                          "crc32_be"
                      ],
                      "type": "enum",
                      "default": "unknown"
                  }
              }
          ],
          "name": "Metadata",
          "type": "record"
      }
      """)
      
      
      with io.BytesIO() as buffer:
          writer = DatumWriter(future_schema)
          encoder = BinaryEncoder(buffer)
          writer.write({"checksum_algorithm": "crc32_be"}, encoder)
          buffer.seek(0)
      
          reader = DatumReader(current_schema)
          decoder = BinaryDecoder(buffer)
          decoded = reader.read(decoder)
      
      print(decoded)
      

      Instead, this results in an exception:

      Traceback (most recent call last):
        File "reproduce-avro.py", line 58, in <module>
          decoded = reader.read(decoder)
        File "/Users/anton/.pyenv/versions/karapace/lib/python3.8/site-packages/avro/io.py", line 649, in read
          return self.read_data(self.writers_schema, self.readers_schema, decoder)
        File "/Users/anton/.pyenv/versions/karapace/lib/python3.8/site-packages/avro/io.py", line 727, in read_data
          return self.read_record(writers_schema, readers_schema, decoder)
        File "/Users/anton/.pyenv/versions/karapace/lib/python3.8/site-packages/avro/io.py", line 922, in read_record
          field_val = self.read_data(field.type, readers_field.type, decoder)
        File "/Users/anton/.pyenv/versions/karapace/lib/python3.8/site-packages/avro/io.py", line 720, in read_data
          return self.read_enum(writers_schema, readers_schema, decoder)
        File "/Users/anton/.pyenv/versions/karapace/lib/python3.8/site-packages/avro/io.py", line 779, in read_enum
          raise avro.errors.SchemaResolutionException(
      avro.errors.SchemaResolutionException: Can't access enum index 2 for enum with 2 symbols
      Writer's Schema: {
        "type": "enum",
        "default": "unknown",
        "name": "ChecksumAlgorithm",
        "symbols": [
          "unknown",
          "xxhash3_64_be"
        ]
      }
      Reader's Schema: {
        "type": "enum",
        "default": "unknown",
        "name": "ChecksumAlgorithm",
        "symbols": [
          "unknown",
          "xxhash3_64_be"
        ]
      }
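One detail in the traceback may be worth checking: the Writer's Schema it prints is the two-symbol schema, identical to the Reader's Schema. In the Python package, the first positional parameter of `DatumReader` is `writers_schema`, and the reader's schema falls back to it when omitted, so `DatumReader(current_schema)` decodes index 2 against a two-symbol writer list. The Avro specification's resolution rule for enums (use the reader's default when the writer's symbol is missing from the reader's enum) needs the actual writer's symbols. The following is a minimal stdlib-only sketch of that rule, not the library's implementation; `resolve_enum` and `EnumResolutionError` are hypothetical names introduced for illustration. Whether avro 1.11.1 applies the default once both schemas are passed is not verified here.

```python
from typing import Optional, Sequence


class EnumResolutionError(Exception):
    """Hypothetical error type for this sketch (not part of the avro package)."""


def resolve_enum(
    index: int,
    writer_symbols: Sequence[str],
    reader_symbols: Sequence[str],
    reader_default: Optional[str] = None,
) -> str:
    """Resolve an encoded enum index the way the Avro spec describes."""
    # The encoded value is a zero-based index into the WRITER's symbol list.
    if not 0 <= index < len(writer_symbols):
        raise EnumResolutionError(
            f"Can't access enum index {index} for enum with "
            f"{len(writer_symbols)} symbols"
        )
    symbol = writer_symbols[index]
    # Symbol known to the reader: return it unchanged.
    if symbol in reader_symbols:
        return symbol
    # Symbol unknown to the reader: fall back to the reader's enum default.
    if reader_default is not None:
        return reader_default
    raise EnumResolutionError(f"Symbol {symbol} not present in reader's schema")


current = ["unknown", "xxhash3_64_be"]
future = ["unknown", "xxhash3_64_be", "crc32_be"]

# "crc32_be" is written as index 2. Resolved against the future (writer)
# symbols, the unknown symbol maps to the reader's default.
resolved = resolve_enum(2, future, current, reader_default="unknown")

# Resolved against the current symbols as if they were the writer's,
# index 2 is out of range, which matches the message in the traceback.
try:
    resolve_enum(2, current, current, reader_default="unknown")
    mismatch_error = None
except EnumResolutionError as exc:
    mismatch_error = str(exc)

print(resolved)
print(mismatch_error)
```

Under this reading, the equivalent call in the reproduction would be `DatumReader(future_schema, current_schema)`, passing the writer's schema first and the reader's schema second.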
      

            People

              antonagestam-aiven Anton Agestam