Description
The following record fails to compile with the specific compiler:
{"name": "ipAddr", "type": "record", "fields": [
  {"name": "addr", "type": [
    {"name": "IPv6", "type": "fixed", "size": 16},
    {"name": "IPv4", "type": "fixed", "size": 4}
  ]}
]}
The stack trace is:
org.apache.avro.AvroRuntimeException: Ambiguous union: [{"type":"fixed","name":"IPv6","size":16},{"type":"fixed","name":"IPv4","size":4}]
    at org.apache.avro.Schema$UnionSchema.<init>(Schema.java:613)
    at org.apache.avro.Schema.parse(Schema.java:874)
    at org.apache.avro.Schema.parse(Schema.java:825)
    at org.apache.avro.Schema.parse(Schema.java:709)
    at org.apache.avro.specific.SpecificCompiler.compileSchema(SpecificCompiler.java:111)
This is on trunk:
$ svn info
Path: .
URL: http://svn.apache.org/repos/asf/hadoop/avro/trunk/lang/java
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 901469
The code for UnionSchema in Schema.java has this constructor:
public UnionSchema(List<Schema> types) {
  super(Type.UNION);
  this.types = types;
  int seen = 0;
  for (Schema type : types) {                 // check legality of union
    switch (type.getType()) {
    case UNION:
      throw new AvroRuntimeException("Nested union: "+this);
    case RECORD:
      if (type.getName() != null)
        continue;
    default:
      int mask = 1 << type.getType().ordinal();
      if ((seen & mask) != 0)
        throw new AvroRuntimeException("Ambiguous union: "+this);
      seen |= mask;
    }
  }
}
That exempts only named RECORD members from the duplicate check; every other type, including FIXED and ENUM, is limited to one member per union. The spec says:
Unions may not contain more than one schema with the same
type, except for the named types record, fixed and enum.
The code above does not adhere to this.
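To illustrate the rule the spec describes, here is a minimal, self-contained sketch. It is not the attached patch and does not use the Avro classes: `UnionCheckSketch`, `Type`, `Member`, and `checkUnion` are hypothetical stand-ins invented for this example. It treats named record, enum, and fixed members alike, and it assumes that two members with the same name should still be rejected, which the spec text quoted above does not state explicitly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UnionCheckSketch {
  // Hypothetical stand-in for Avro's schema type enum (not the real one).
  enum Type { RECORD, ENUM, FIXED, ARRAY, MAP, UNION,
              STRING, BYTES, INT, LONG, FLOAT, DOUBLE, BOOLEAN, NULL }

  // Hypothetical stand-in for a union member: a type plus an optional name.
  static final class Member {
    final Type type;
    final String name; // null for unnamed types
    Member(Type type, String name) { this.type = type; this.name = name; }
  }

  // Throws IllegalArgumentException if the union is not legal.
  static void checkUnion(List<Member> members) {
    int seen = 0;                              // bit mask of unnamed types seen
    Set<String> seenNames = new HashSet<>();   // names of named types seen
    for (Member m : members) {
      switch (m.type) {
        case UNION:
          throw new IllegalArgumentException("Nested union");
        case RECORD:
        case ENUM:
        case FIXED:
          if (m.name != null) {
            // Assumption: a repeated name is still ambiguous.
            if (!seenNames.add(m.name))
              throw new IllegalArgumentException("Duplicate name: " + m.name);
            continue;                          // next member of the union
          }
          // unnamed record/enum/fixed: fall through to the per-type check
        default:
          int mask = 1 << m.type.ordinal();
          if ((seen & mask) != 0)
            throw new IllegalArgumentException("Ambiguous union: " + m.type);
          seen |= mask;
      }
    }
  }

  public static void main(String[] args) {
    // The union from the ipAddr record: two fixed types with different names.
    checkUnion(Arrays.asList(new Member(Type.FIXED, "IPv6"),
                             new Member(Type.FIXED, "IPv4")));
    System.out.println("two named fixed types accepted");
  }
}
```

With this check, the `addr` union from the record above is accepted, while a union with two `STRING` members or a nested union is still rejected.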
I am attaching a patch for this code only. A unit test is probably needed as well, using a test schema that contains two records, two fixed types, and two enums, plus one of each of the unnamed types. I am not yet familiar with the test infrastructure.
I am also not familiar with what else this may impact.
Attachments
Issue Links
- relates to AVRO-362 Add test to ensure Python implementation handles Union schema with two fixed types of different names (Closed)