Apache Avro / AVRO-1343

Python: validate too permissive on records with extra fields


Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: python
    • Labels: None

    Description

      Python's validator silently accepts (generic) records with extra fields and considers them valid.

      For example, given the schema:

      {"type": "record",
       "name": "Test",
       "fields": [{"name": "f", "type": "long"}]}

      io.validate returns True for records like:

      {'f': 5, 'extra_field': "abc"}

      even though extra_field is not declared in the schema.
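
      The difference between the permissive check and a stricter one can be sketched as follows. This is a simplified, standalone illustration, not Avro's io.validate and not the attached patch: the helper names are hypothetical, and a schema is reduced to a dict mapping field names to Python types.

```python
# Hypothetical helpers for illustration only; not part of the avro package.

def validate_record_permissive(fields, datum):
    """Accept a dict if every declared field has a value of the expected type.

    Keys that the schema does not declare are ignored.
    """
    return isinstance(datum, dict) and all(
        isinstance(datum.get(name), expected) for name, expected in fields.items()
    )


def validate_record_strict(fields, datum):
    """Same check as above, but also reject keys the schema does not declare."""
    return validate_record_permissive(fields, datum) and set(datum) <= set(fields)


# Stand-in for the "Test" schema: one field "f" of type long (int here).
test_fields = {"f": int}
record = {"f": 5, "extra_field": "abc"}

permissive_result = validate_record_permissive(test_fields, record)
strict_result = validate_record_strict(test_fields, record)
print(permissive_result, strict_result)
```

      Under these assumptions the permissive check accepts the record and the strict check rejects it, because extra_field is not a declared field.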

      This is especially problematic for encoding unions, because internally the Python serializer uses validate to find the appropriate schema with which to encode a given object.

      In the current implementation, the serializer encodes a union value with the last branch schema for which validate(schema, obj) returns True. If validate is too permissive, the serializer will frequently pick the wrong branch.
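
      A minimal sketch of how "last matching branch" selection interacts with a permissive validator. All names here are hypothetical, and the union is modelled as a plain list of field-name-to-type dicts rather than real Avro schema objects.

```python
# Hypothetical model of union branch selection; not the avro package's code.

def matches_permissive(fields, datum):
    """Every declared field is present with the expected type; extras ignored."""
    return isinstance(datum, dict) and all(
        isinstance(datum.get(name), expected) for name, expected in fields.items()
    )


def matches_strict(fields, datum):
    """Like matches_permissive, but undeclared keys cause a rejection."""
    return matches_permissive(fields, datum) and set(datum) <= set(fields)


def pick_branch(branches, datum, matches):
    """Return the index of the LAST branch that matches, or -1 if none does."""
    chosen = -1
    for index, fields in enumerate(branches):
        if matches(fields, datum):
            chosen = index
    return chosen


# A union of two record shapes: a 3-field record first, a 2-field record second.
branches = [
    {"x": int, "y": int, "z": int},
    {"x": int, "y": int},
]
datum = {"x": 1, "y": 2, "z": 3}

permissive_choice = pick_branch(branches, datum, matches_permissive)
strict_choice = pick_branch(branches, datum, matches_strict)
print(permissive_choice, strict_choice)
```

      With the permissive matcher, both branches accept the datum and the later, 2-field branch wins even though the 3-field branch is the exact fit. With the strict matcher, only the 3-field branch matches.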

      I will attach two patches: one to the tests and one to the validate function.

      Attachments

        1. AVRO-1343-validate.patch
          1 kB
          Jeremy Kahn
        2. AVRO-1343-tests.patch
          2 kB
          Jeremy Kahn

            People

              trochee Jeremy Kahn
              trochee Jeremy Kahn
              Votes: 0
              Watchers: 4
