btw, Chris Nauroth, is the use case of an upgraded client with a non-upgraded NM important?
I brought this up because I've been in situations where someone wanted to pick up a client-side bug fix ahead of the cluster's upgrade schedule. It looks to me like this is a gray area in our policies, though.
From the content in that page, we've made a specific commitment that old clients continue to work with new servers. As Jian said, that part is fine with this patch. What is less clear is whether or not we've made a commitment for new clients to work with old servers. Of course, it's best to strive for it, and forward compatibility is one of our motivations in the protobuf messages, but I can't tell from that policy statement if we've made a commitment to it. This is probably worth some wider discussion before changing the patch.
If we do need to achieve that kind of compatibility, then it's going to be a more challenging patch. I think we'd end up needing to add an optional version number, or at least a flag, on the Container returned in the AllocateResponse. This would tell the client whether or not the container can accept the new syntax, and the client could then use the old code path as a fallback for compatibility with old servers that don't set this version number or flag. That would work for containers submitted by an AM. I can't think of a similar solution that would work for the initial AM container though, because the RPC sequence there doesn't seem to offer as clear a way to indicate the capabilities of the container that's going to run the AM before it is submitted.
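To make the fallback idea a bit more concrete, here is a rough sketch in plain Java. It is not based on the actual YARN API: `AllocatedContainer`, `getSyntaxVersion`, `LaunchSyntax`, and `MIN_VERSION_FOR_NEW_SYNTAX` are all hypothetical names, and the `OptionalInt` only stands in for an optional protobuf field that an old server would leave unset.

```java
import java.util.OptionalInt;

public class SyntaxFallbackSketch {

    /** Which code path the client uses when building the launch context. */
    enum LaunchSyntax { LEGACY, NEW }

    /**
     * Hypothetical stand-in for the Container returned in an AllocateResponse.
     * The version is optional: an old server never sets it.
     */
    static final class AllocatedContainer {
        private final OptionalInt syntaxVersion;

        AllocatedContainer(OptionalInt syntaxVersion) {
            this.syntaxVersion = syntaxVersion;
        }

        OptionalInt getSyntaxVersion() {
            return syntaxVersion;
        }
    }

    /** Assumed: the first version whose containers accept the new syntax. */
    static final int MIN_VERSION_FOR_NEW_SYNTAX = 1;

    /**
     * Use the new syntax only when the server has positively advertised
     * support; an unset or too-low version falls back to the old code path.
     */
    static LaunchSyntax chooseSyntax(AllocatedContainer container) {
        OptionalInt version = container.getSyntaxVersion();
        if (version.isPresent() && version.getAsInt() >= MIN_VERSION_FOR_NEW_SYNTAX) {
            return LaunchSyntax.NEW;
        }
        return LaunchSyntax.LEGACY;
    }

    public static void main(String[] args) {
        AllocatedContainer fromOldServer = new AllocatedContainer(OptionalInt.empty());
        AllocatedContainer fromNewServer = new AllocatedContainer(OptionalInt.of(1));

        System.out.println("old server: " + chooseSyntax(fromOldServer));
        System.out.println("new server: " + chooseSyntax(fromNewServer));
    }
}
```

Defaulting to the legacy path when the field is absent is what keeps a new client safe against an old server. Note that this only covers containers the AM requests; it does nothing for the initial AM container, which is the open problem mentioned above.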
Like I said, please do discuss this more widely before pursuing it. I'd hate to send you down an unnecessary rathole if the current patch is fine. Thanks, Jian.