Details
- New Feature
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
Thrift is already designed to have modular Protocols, which define the various elements of binary encoding. However, because of the way our code is generated, we're also implicitly requiring every protocol to have a structure like that of the Binary protocol.
This has served us well so far with its simplicity, and other protocols (like Compact and JSON) have been able to squeeze into this model successfully, though at a noticeable cost. For instance, Compact protocol has to keep its own internal state in a stack to know the last field id it wrote, so that it can write compact deltas. The JSON protocol is even more complex, keeping a truly monstrous amount of state around in order to do the right thing at the right time. All this state tracking comes at a cost, both in complexity and in actual runtime.
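To make the state-tracking cost concrete, here is a minimal sketch of the kind of bookkeeping described above. It is not the actual Compact protocol code: the class `DeltaFieldWriter` and its method names are hypothetical, and the real protocol's header encoding is more involved. It only shows why a user-level stack is needed when the protocol sees nothing but a flat stream of begin/end calls.

```python
class DeltaFieldWriter:
    """Toy writer (hypothetical) that emits field ids as deltas from the previous id."""

    def __init__(self):
        self.out = []          # emitted tokens, kept in a list for inspection
        self._last_id = 0      # last field id written in the current struct
        self._id_stack = []    # saved last ids of the enclosing structs

    def write_struct_begin(self):
        # Entering a struct: remember where the parent was, start fresh.
        self._id_stack.append(self._last_id)
        self._last_id = 0

    def write_field_begin(self, field_id):
        self.out.append(("delta", field_id - self._last_id))
        self._last_id = field_id

    def write_struct_end(self):
        # Leaving a struct: restore the parent's last field id.
        self.out.append(("stop",))
        self._last_id = self._id_stack.pop()


w = DeltaFieldWriter()
w.write_struct_begin()
w.write_field_begin(1)
w.write_field_begin(3)
w.write_struct_begin()      # nested struct stored in field 3
w.write_field_begin(2)
w.write_struct_end()
w.write_field_begin(4)      # delta is relative to field 3 of the outer struct
w.write_struct_end()

deltas = [token[1] for token in w.out if token[0] == "delta"]
```

Every nested struct costs a push and a pop on `_id_stack`, and the writer has to trust that callers pair their begin/end calls correctly.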
I think that there's a solution to this problem - making the serialization code itself pluggable. That is, rather than there being a single "write" and "read" method for every protocol, we could let Protocols require a certain kind of Serializer to interact with them. Binary protocol could use the "default tagged" serializer, which would look like what we have today. JSON would probably use a custom one that basically let it write out strings and nothing else. Compact would also likely use a custom serializer that kept all the needed state on the system stack, rather than in a user-code stack. Dare I say it, but it's possible that if we did it right, we could even serialize Thrift structs with Avro's serialization format!
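One way to picture the proposal is the rough sketch below. All names here (`Point`, `FIELDS`, `TaggedSerializer`, `DeltaSerializer`, `Protocol`) are invented for illustration and are not part of any Thrift library. The sketch assumes generated structs expose their field metadata, and that a protocol is constructed with the serializer it requires.

```python
from dataclasses import dataclass


@dataclass
class Point:
    # (field id, attribute name) pairs, standing in for generated metadata.
    FIELDS = ((1, "x"), (2, "y"))
    x: int
    y: int


class TaggedSerializer:
    """Roughly today's shape: every field is written with its full id."""

    def write(self, struct, out):
        for field_id, name in struct.FIELDS:
            out.append((field_id, getattr(struct, name)))
        out.append("STOP")


class DeltaSerializer:
    """Writes id deltas; the previous id lives in a local variable, not in protocol state."""

    def write(self, struct, out):
        last_id = 0
        for field_id, name in struct.FIELDS:
            out.append((field_id - last_id, getattr(struct, name)))
            last_id = field_id
        out.append("STOP")


class Protocol:
    """A protocol that delegates struct traversal to the serializer it requires."""

    def __init__(self, serializer):
        self.serializer = serializer
        self.out = []

    def write_struct(self, struct):
        self.serializer.write(struct, self.out)
        return self.out


tagged = Protocol(TaggedSerializer()).write_struct(Point(5, 7))
delta = Protocol(DeltaSerializer()).write_struct(Point(5, 7))
```

In this sketch `DeltaSerializer` needs no explicit stack: a nested struct would simply be another call to `write`, with its own `last_id` on the call stack.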
The upshot of this change would be to make many protocols faster and simpler, while also allowing us to implement a broader set of protocols that don't fit the traditional Thrift-style protocol model.
Attachments
Issue Links
- is depended upon by
- THRIFT-1239 TupleProtocol - An extremely compact, temporary protocol (Closed)