A patch with an alternative solution is attached to issue #
HADOOP-6949. This is a generalized fix for any protocol that pumps arrays of primitives (long, byte, etc.) through the RPC and ObjectWritable mechanisms, including block reports and journaling.
It has the following benefits:
1. it is fully backward compatible with any persisted data that may be out there, because it can still successfully receive/read the old format
2. it has minimal changes to the ObjectWritable module, and doesn't require any changes to protocols or APIs
3. it is consistent with current practice regarding *Writable classes
4. It automatically benefits all protocols that send arrays of primitives through the RPC and ObjectWritable mechanisms, without changes to the protocol APIs
5. It has fully optimized wire format, with almost no object overhead, unlike solutions based on ArrayWritable.
That said, while it is backward compatible with old stored data, it is admittedly not backward compatible with older versions of Client software: A NEWER RPC RECEIVER can correctly receive data from an OLDER SENDER, but an OLDER RECEIVER will not correctly receive arrays of primitives sent from a NEWER SENDER. I haven't been able to think of a solution without getting into protocol negotiation schemes.
I understand that Liyin's proposed solution, by adding new signatures to the journaling protocol, could have the benefit of backward compatibility with old Clients. If that is critical, then we must go that way. But that way only benefits the specific APIs that get modified, and adds significant maintenance complexity in the long run. I would also point out that Liyin's current patch does NOT provide that backward compatibility, because it replaces the protocol signatures rather than extending them. If the current patch is acceptable, meaning we don't need Client API backward compatibility, then I strongly recommend the more general solution of
As far as performance goes, I measured the RPC overhead time in sending a 50,000-block Block Report in the form of an array of long. (That's the problem that got me working on this.) The current RPC round-trip overhead in a system with no resource contention is about 180 msec. With my proposed patch, and no code change to the calling code, this is reduced to only 50 msec, i.e. an 80% reduction. The size reduction is comparable to what Liyin measured, for similar reasons.