Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Resolved
-
1.7.4
-
None
-
None
Description
We can't get a correct result when decoding enums using resolving decoder. e.g.
{ "type" : "record", "name" : "TestEnum", "fields" : [ { "name" : "MyMode", "type" : { "type" : "enum", "name" : "Mode", "symbols" : [ "MEMORY", "DISK" ] } } ] }
We encoded "DISK"(1), then decoded with resolving decoder, got "MEMORY"(0).
I examined the code and found that there is a sort after reading names of reader.
I could't quite understand the author's intention, but it really can not work well.
When decoding my enum, the return value is actually the position of the sorted names, and obviously it's not correct.
Symbol Symbol::enumAdjustSymbol(const NodePtr& writer, const NodePtr& reader) { vector<string> rs; size_t rc = reader->names(); for (size_t i = 0; i < rc; ++i) { rs.push_back(reader->nameAt(i)); } sort(rs.begin(), rs.end()); // the strange sort
Here is my complete test case.
enum Mode {
MEMORY,
DISK,
};
struct TestEnum {
Mode MyMode;
};
#include "ts_enum.h" #include "avro/Compiler.hh" #include "avro/ValidSchema.hh" using namespace std; using namespace avro; using namespace enum_test; static const char ts_schema_string[] = "{ \"type\" : \"record\", \"name\" : \"TestEnum\", \"fields\" : " "[ { \"name\" : \"MyMode\", \"type\" : " "{ \"type\" : \"enum\", \"name\" : \"Mode\", " "\"symbols\" : [ \"MEMORY\", \"DISK\" ] } } ]}"; int main(int argc, char * argv[]) { TestEnum te1, te2; ValidSchema reader = compileJsonSchemaFromString(ts_schema_string); ValidSchema writer = compileJsonSchemaFromString(ts_schema_string); //encode TestEnum auto_ptr<OutputStream> out_stream = memoryOutputStream(); EncoderPtr encoder = binaryEncoder(); encoder->init(*out_stream); te1.MyMode = DISK; encode(*encoder, te1); encoder->flush(); //decode TestEnum auto_ptr<InputStream> in_stream = memoryInputStream(*out_stream); DecoderPtr decoder = resolvingDecoder(writer, reader, avro::binaryDecoder()); decoder->init(*in_stream); decode(*decoder, te2); cout<<"TE1: "<<te1.MyMode << " | TE2: "<<te2.MyMode<<endl; return 0; }
The result
-------------------
TE1: 1 | TE2: 0
I debuged into avro code.
In Symbol::enumAdjustSymbol, there is a vector<string> of reader's enum names, and after the sort, "MEMOEY, DISK" turned to be "DISK, MEMORY".
At last, a vector<int> of writer's enum names saved every position of the sorted vector<string>.
As a result, in the returned symbol, MEMORY's position is 1 and DISK's position is 0.
Finally, when we decoding the enum, the position is returned to the target object.
I could't quite understand the author's intention here but when I commented the sort, everything worked well.