I don't want this JIRA to devolve into a discussion of the merits and demerits of various serialization frameworks. In the past those discussions have been what resulted in us picking no framework instead of just getting it done with something.
That said, here is my quick summary of why I picked protobufs vs Avro and Thrift:
Avro is a fantastic data serialization framework with the following unique features: (a) a dynamic schema stored with the data, (b) a very compact storage format, (c) a standardized container format, and (d) Java-based codegen that integrates easily into a build. Features (a), (b), and (c) are very good when you want to store a lot of data on disk: it's small, you can read it without knowing what someone else wrote, and it's splittable and compressible in MR. (d) is great since you don't need to make developers install anything.
For the case of the DataTransferProtocol and Hadoop RPC in general, features (a) through (c) are less useful. The different parts of HDFS may diverge slightly over time, but there's no need to talk to a completely unknown server. Compactness is always a plus, but a 10-20% improvement in the compactness of the header translates to a <1% improvement in the compactness of the data transfer as a whole, since the data:header ratio is very high. The container format doesn't help for RPC, where the data is transient anyway.
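To put rough numbers on that ratio (the packet and header sizes below are assumptions for illustration, not measured values):

```python
# Rough illustration: a data-transfer packet carries a large block payload
# behind a small serialized header, so shrinking the header barely moves
# the total byte count on the wire.
data_bytes = 64 * 1024       # hypothetical block payload per packet
header_bytes = 25            # hypothetical serialized header size
saved = 0.20 * header_bytes  # a 20% denser header encoding

fraction_saved = saved / (data_bytes + header_bytes)
assert fraction_saved < 0.01  # well under a 1% improvement overall
print(f"{fraction_saved:.4%} of total bytes saved")
```

Even a dramatic header-size win is noise next to the payload, which is why compactness alone doesn't decide this.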
In addition, the dynamic nature of Avro requires that readers and writers know the schema of their peer in order to communicate. This has to be done with a handshake of some kind. It would certainly be possible to implement, but to do it without an extra round trip you need to add schema dictionaries, hashes, etc. The peer's schema also needs to be threaded through every place where serialization and deserialization happen. This is possible, but I didn't want to do that work.
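The "schema dictionaries, hashes" idea can be sketched roughly like this (everything here is hypothetical, including the fingerprint width and the example schema, and a real implementation would need a handshake fallback for cache misses):

```python
import hashlib

# Sketch of a per-connection schema dictionary: each peer caches schemas
# keyed by a short fingerprint, so messages can carry an 8-byte hash
# instead of the full schema text.

schema_cache = {}  # fingerprint -> schema text

def fingerprint(schema_json: str) -> bytes:
    return hashlib.sha256(schema_json.encode()).digest()[:8]

def register(schema_json: str) -> bytes:
    fp = fingerprint(schema_json)
    schema_cache[fp] = schema_json
    return fp

def resolve(fp: bytes) -> str:
    # A real implementation would fall back to an extra round trip
    # (a handshake) when the fingerprint is not yet cached.
    return schema_cache[fp]

writer_schema = '{"type":"record","name":"OpReadBlock","fields":[]}'
fp = register(writer_schema)
assert resolve(fp) == writer_schema  # reader recovers the peer's schema
```

The cost isn't the cache itself; it's that this peer-schema state then has to be plumbed through every serialization call site.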
Thrift vs Protobufs
I like Thrift a lot – in fact I'm a Thrift committer and PMC member. So it might seem strange that I didn't pick Thrift. Here's my thinking:
- Thrift and Protobuf are more or less equivalent: tagged serialization, a codegen tool written in C++, good language support, a mature wire format
- Thrift has the plus side that it's a true open source community at the ASF with some committer overlap with the people working on Hadoop
- Protobufs has the plus side that, apparently, MR2/YARN has chosen it for their RPC formats.
- Protobuf has two nice features that Thrift doesn't have yet: 1) when unknown fields are read, they are kept in a map and put back on the wire if the same object is reserialized; 2) it has a decent plugin system that makes it easy to modify the generated code, in theory even with a plugin written in Python or Java. These could be implemented in Thrift, but again, I didn't want to take the time.
- Thrift's main advantage vs protobufs is a standardized RPC wire format and set of clients/servers. I don't think the Java implementations in Thrift are nearly as mature as the Hadoop RPC stack, and swapping in an entirely new RPC transport is a lot more work than just switching serialization mechanisms. Since we already have a pretty good (albeit nonstandard) RPC stack, this advantage of Thrift is less of a big deal.
- In the end I was torn between protobufs and Thrift. Mostly because MR2 already uses Protobuf, I just went with it.
- I think protobufs is a good choice for wire format serialization. I still think Avro is a good choice for disk storage (eg perhaps using Avro to store a denser machine-readable version of the audit log). These two things can coexist just fine.
- There is working code attached to this JIRA. If you disagree with my thinking above, please do not post a comment; post a patch with your serialization of choice, showing that the code is at least as clean, the performance is comparable, and the tests pass.
- IMO, there are a lot of interesting things to work on and discuss in Hadoop; this is not one of them. Let's just get something (anything) in there that works and move on with our lives.
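For the curious, the unknown-field round-tripping mentioned in point 1 above can be illustrated without the protobuf library at all. This toy parser (field numbers, wire bytes, and the single "known" field are all made up for illustration) only declares field 1, parses varint fields, stashes anything unrecognized in a map, and emits it back on reserialization, which is the behavior Protobuf gives you for free:

```python
# Minimal varint-only sketch of Protobuf-style unknown field retention.

def read_varint(buf, i):
    shift = result = 0
    while True:
        b = buf[i]; i += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, i
        shift += 7

def write_varint(n):
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append(b | (0x80 if n else 0))
        if not n:
            return bytes(out)

KNOWN_FIELDS = {1}  # this toy "message type" only declares field 1

def parse(buf):
    known, unknown = {}, {}  # field number -> varint value
    i = 0
    while i < len(buf):
        key, i = read_varint(buf, i)
        field, wire_type = key >> 3, key & 7
        assert wire_type == 0  # toy parser: varint fields only
        value, i = read_varint(buf, i)
        (known if field in KNOWN_FIELDS else unknown)[field] = value
    return known, unknown

def serialize(known, unknown):
    out = bytearray()
    for field, value in sorted({**known, **unknown}.items()):
        out += write_varint(field << 3) + write_varint(value)
    return bytes(out)

# field 1 = 150 (known), followed by field 2 = 7 (unknown to this reader)
wire = b"\x08\x96\x01\x10\x07"
known, unknown = parse(wire)
assert known == {1: 150} and unknown == {2: 7}
assert serialize(known, unknown) == wire  # unknown field survives the trip
```

This matters for RPC proxies and version skew: an older reader can pass a newer peer's message through without silently dropping fields it doesn't understand.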
So, assuming I have support from a couple of committers, I will move forward to clean up this patch. As you can see above, it already works modulo a bug with the block token. With some more comments, a little refactoring, and a build target to regenerate the code, I think we could commit this.