How is it going? Do you need help?
I've practiced a bit with ProtoBuf.
It seems to be really cool and fast (or at least small).
BUT I'm really struggling with the question of how we are (or rather, how we could be) able to integrate it.
Let's summarize our current state:
We have an abstract class BSPMessage which leaves the types of tags and data to the concrete implementations.
Since it is a Writable, it goes through Hadoop's RPC mechanism, where it gets serialized and deserialized.
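To make the current state concrete, here is a rough sketch of how I understand it. The generic signature of BSPMessage, the DistanceMessage class and the RoundTrip helper are assumptions for illustration, not our actual code. The small Writable interface only mirrors the two methods of Hadoop's org.apache.hadoop.io.Writable so that the example compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Stand-in with the same two methods as org.apache.hadoop.io.Writable,
// declared here only so the sketch compiles without Hadoop.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Assumed shape of the abstract message: tag and data types stay open.
abstract class BSPMessage<T, D> implements Writable {
    abstract T getTag();
    abstract D getData();
}

// Hypothetical user message: a vertex name as tag, a distance as data.
class DistanceMessage extends BSPMessage<String, Double> {
    private String tag;
    private double distance;

    DistanceMessage() { }  // empty instance, to be filled by readFields

    DistanceMessage(String tag, double distance) {
        this.tag = tag;
        this.distance = distance;
    }

    @Override String getTag() { return tag; }
    @Override Double getData() { return distance; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(tag);          // the user decides how the tag is encoded
        out.writeDouble(distance);  // and how the data is encoded
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        tag = in.readUTF();         // must mirror the order used in write
        distance = in.readDouble();
    }
}

class RoundTrip {
    // Serializes any Writable into a byte array.
    static byte[] toBytes(Writable w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        }
        return buffer.toByteArray();
    }

    // Rebuilds a DistanceMessage from the bytes produced by toBytes.
    static DistanceMessage fromBytes(byte[] bytes) throws IOException {
        DistanceMessage message = new DistanceMessage();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            message.readFields(in);
        }
        return message;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = toBytes(new DistanceMessage("vertex-7", 4.5));
        DistanceMessage copy = fromBytes(wire);
        System.out.println(copy.getTag() + " -> " + copy.getData());
    }
}
```

The point is that the framework only sees the Writable contract, while the concrete field layout lives entirely in the user's class.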
What I'm wondering now is how we can improve this using ProtoBuf.
ProtoBuf needs a "*.proto" file, which has to be compiled into a specific model *.java file. In this file you declare which fields the message has. In our case this is not known at compile time (for example, a user implements a custom vertex that contains distances or something like that), so we would have to leave the serialization up to the user.
The question is: how could we do this?
Here are some thoughts (don't take this too seriously, it's just brainstorming):
There are two options:
- A generic ".proto" model with just two string fields, one for the tag and one for the data
- We leave the compiling and implementing of the protos to the user
The first is ultra simple and we don't have to worry about anything the user will submit, since you can serialize everything to strings.
But I think we are going to "ruin" the optimizations that COULD have been made if the type was known.
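A small sketch of what I mean by lost optimizations. It deliberately uses plain java.io instead of ProtoBuf, so it only shows the general effect of sending a number as text versus as a typed value; the class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class EncodingSize {
    // String-based style: the value travels as text and must be parsed on arrival.
    static byte[] asString(double value) {
        return Double.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Typed style: the value travels as a fixed 8-byte IEEE 754 double.
    static byte[] asDouble(double value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeDouble(value);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        double distance = Math.PI;  // a value with many significant digits
        System.out.println("as string: " + asString(distance).length + " bytes");
        System.out.println("as double: " + asDouble(distance).length + " bytes");

        // The receiver of the string form has to parse the text back into a number.
        String text = new String(asString(distance), StandardCharsets.UTF_8);
        System.out.println("round trip ok: " + (Double.parseDouble(text) == distance));
    }
}
```

For short values such as 4.5 the text form can actually be smaller, so the size penalty depends on the data; the parsing step on the receiving side is needed in every case.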
The second option is really messy and not very user-friendly (the user has to compile everything and put it into a repository at runtime so that we know the proto), but in contrast to the first option it could give better results.
Do we have any other options?