Details
Bug
Status: Open
Major
Resolution: Unresolved
1.8.1
None
None
Description
The Python DatumWriter seems to evaluate types in a union in reverse order. For example, with the following schema:
{ "type": "record", "name": "MyRecord", "fields": [ {"name": "my_field", "type": ["boolean", "double"]} ] }
If I set my_field to a boolean in my data, it seems to be encoded as a double. However, if I reverse the order of the types in my union (["double", "boolean"]), it seems to be encoded as a boolean.
This seems unintuitive for a couple of reasons:
- I'd expect the types in the union to be evaluated in the order they are specified, but they seem to be evaluated in reverse order
- Encoding a boolean as a double is a bit weird
I'm not sure if this is a bug or expected behaviour though. If this is the expected behaviour (or it can't be changed without breaking things), then it would be nice if it were documented somewhere (I searched but couldn't find anything), as it's pretty unintuitive.
I've attached a full test case. The test case encodes and then decodes the data with both the original schema and the reversed version. For me it prints:
Type: <type 'float'>
Type from reversed schema: <type 'bool'>
Ideally I'd expect the type to be 'bool' both times, but failing that I'd expect the type to be 'bool' the first time, and 'float' the second time.
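One possible explanation, which I have not confirmed against the Avro source, is sketched below. In Python, bool is a subclass of int, so a numeric check written with isinstance would also accept True and False. If the union resolver also keeps the last matching branch rather than stopping at the first, the result is exactly what I observe. The functions matches, pick_branch_last_match and pick_branch_first_match are hypothetical stand-ins written for this illustration. They are not Avro's API.

```python
# Hypothetical stand-ins for illustration only; NOT the actual Avro code.

def matches(type_name, datum):
    """Loose per-type check. The numeric branch is the assumption under test."""
    if type_name == "boolean":
        return isinstance(datum, bool)
    if type_name == "double":
        # bool is a subclass of int, so True/False pass this check too.
        return isinstance(datum, (int, float))
    return False


def pick_branch_last_match(union, datum):
    """Scan every branch and keep the LAST one that matches."""
    chosen = None
    for name in union:
        if matches(name, datum):
            chosen = name  # no break: later matches overwrite earlier ones
    return chosen


def pick_branch_first_match(union, datum):
    """Stop at the FIRST branch that matches (the order I expected)."""
    for name in union:
        if matches(name, datum):
            return name
    return None


print(pick_branch_last_match(["boolean", "double"], True))   # double
print(pick_branch_last_match(["double", "boolean"], True))   # boolean
print(pick_branch_first_match(["boolean", "double"], True))  # boolean
print(pick_branch_first_match(["double", "boolean"], True))  # double
```

Under these assumptions, the last-match version reproduces the output above (double for the original schema, boolean for the reversed one), while the first-match version gives the 'bool' then 'float' ordering I would have expected as a fallback. Getting 'bool' both times would additionally require the double check to reject bool values.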