The simplest way to fix this is to always create a new object, but that won't work well until
HADOOP-1230 is done.
You could also use a container object like in
HADOOP-4065? Or require that all the thrift fields have the required attribute, or at least note that in a comment?
For this and RecordSerialization
HADOOP-4199, there's also the issue that both use the Binary format by default, whereas thrift and record io support multiple formats. If thrift finally implements a compact binary format, this will be even more important since people will have data in both.
The other thing is that Hive has something called TCTLSeparatedProtocol, which implements the Thrift Protocol interface and allows thrift to parse simple text files with ctl separators. We definitely have data in both Binary and ctl-separated formats, so we would need a way to configure this.
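For the configuration part, something like the following might be enough. This is only a minimal sketch: the key name `thrift.serialization.format`, the `ProtocolChooser` class, and the `WireFormat` enum are all hypothetical, and a plain `java.util.Properties` stands in for the job configuration. None of these names exist in Hadoop, Hive, or Thrift.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical sketch: choose a wire format from configuration. */
class ProtocolChooser {
    /** Hypothetical configuration key; not an existing Hadoop or Hive property. */
    static final String FORMAT_KEY = "thrift.serialization.format";

    /** Labels for the two formats discussed above; the enum itself is illustrative. */
    enum WireFormat { BINARY, CTL_SEPARATED }

    /** Returns the configured format, defaulting to BINARY when the key is unset. */
    static WireFormat choose(Properties conf) {
        String name = conf.getProperty(FORMAT_KEY, "binary");
        switch (name.trim().toLowerCase(Locale.ROOT)) {
            case "binary":
                return WireFormat.BINARY;
            case "ctl":
                return WireFormat.CTL_SEPARATED;
            default:
                throw new IllegalArgumentException("Unknown format: " + name);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(choose(conf));   // key unset, falls back to the default
        conf.setProperty(FORMAT_KEY, "ctl");
        System.out.println(choose(conf));   // explicit ctl-separated choice
    }
}
```

The serializer and deserializer would then build whichever Thrift protocol matches the returned value, so jobs reading Binary data and jobs reading ctl-separated data could share the same serialization class.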
But I think those are add-ons, so you could submit this as is?
Also, can someone create a category for contrib/serialization?