Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change
    • Target Version/s:

      Description

      There are a few inconsistencies in the current RPC protocol that make new client implementation (especially non-blocking) unnecessarily hard. For example:

      1. proto file naming
        RpcPayloadHeader.proto include not only RpcPayLoadHeaderProto, but also RpcResponseHeaderProto, which is irrelevant to the file name.
        hadoop_rpc.proto only include HadoopRpcRequestProto, and the filename "hadoop_rpc" is strange comparing to other .proto file names.
        How about merge those two file into HadoopRpc.proto?
      2. proto class naming
        In rpc request RpcPayloadHeaderProto includes callId, but in rpc response callId is included in RpcResponseHeaderProto, and there is also HadoopRpcRequestProto, this is just too confusing.
      3. The rpc system is not fully protobuf based, there are still some Writables:
        RpcRequestWritable, RpcResponseWritable, rpc response exception name, stack trace string and sasl request/response.
        RpcRequestWritable uses protobuf style varint32 prefix, but RpcResponseWritable uses int32 prefix, why this inconsistency?

      Currently rpc request is splitted into length, PayLoadHeader and PayLoad, and response into RpcResponseHeader, response and error message. Wrapping request and response into single RequestProto and ResponseProto would be better, since this gives a formal complete wire format definition.

      These issues above make it very confusing and hard for developers to use these rpc interfaces.

        Activity

        Hide
        Sanjay Radia added a comment -

        1. proto file naming

        2. proto class naming

        Create subtask HADOOP-9140 to address 1 and 2.
        Note hadoop-rpc.proto is for the protobuf RpcEngine and cannot hence the two protos are distict and should not be combined.

        Show
        Sanjay Radia added a comment - 1. proto file naming 2. proto class naming Create subtask HADOOP-9140 to address 1 and 2. Note hadoop-rpc.proto is for the protobuf RpcEngine and cannot hence the two protos are distict and should not be combined.
        Hide
        Binglin Chang added a comment -

        Note hadoop-rpc.proto is for the protobuf RpcEngine and cannot hence the two protos are distict and should not be combined.

        Yes, this makes sense, Thanks.

        Show
        Binglin Chang added a comment - Note hadoop-rpc.proto is for the protobuf RpcEngine and cannot hence the two protos are distict and should not be combined. Yes, this makes sense, Thanks.
        Hide
        Sanjay Radia added a comment -

        3. The rpc system is not fully protobuf based, there are still some Writables: ... rpc response exception name and stack trace string.

        Have filed HADOOP-9151 to fix this.

        Show
        Sanjay Radia added a comment - 3. The rpc system is not fully protobuf based, there are still some Writables: ... rpc response exception name and stack trace string. Have filed HADOOP-9151 to fix this.
        Hide
        Sanjay Radia added a comment -

        there are still some Writables: RpcRequestWritable and RpcResponseWritable

        These are wrappers and not actual writables sent across the wire. They happen to be writable so that the client an server can call the write and read methods. I will create a jira to document this better.

        Show
        Sanjay Radia added a comment - there are still some Writables: RpcRequestWritable and RpcResponseWritable These are wrappers and not actual writables sent across the wire. They happen to be writable so that the client an server can call the write and read methods. I will create a jira to document this better.
        Hide
        Sanjay Radia added a comment -

        I have filed jiras for addressing all feedback plus and additional one for reducing buffer copies.
        The one left is varint32 and int32 inconsistency; this isn't a protobuf vs writable issue. I can fix it easily desired. Any feedback?

        Show
        Sanjay Radia added a comment - I have filed jiras for addressing all feedback plus and additional one for reducing buffer copies. The one left is varint32 and int32 inconsistency; this isn't a protobuf vs writable issue. I can fix it easily desired. Any feedback?
        Hide
        Binglin Chang added a comment -

        Thanks for all this, Sanjay. Let me summarize it to see if I fully understand it?
        The final wire format would be(I assume prefix is unified to 4 byte integer):
        request:

        4byte total length 4byte header len header 4byte request len request

        response:

        4byte total length 4byte header len header(with error string) 4byte response len(optional if error) response(optional if error)

        Still I prefer the most simple way.
        request:

        4byte length request(with request no type bytes)

        response:

        4byte length response(with optional response no type bytes and optional error)

        But if you choose the above solution, I'm also OK.

        What's the detail of avoiding buffer copy? I post in another thread that writedelimited to can't avoid buffer copy, and codedinputstream create a buffer to serialized, so unless we use one codeinputstream throughout code, it maybe be difficult otherwise.

        Show
        Binglin Chang added a comment - Thanks for all this, Sanjay. Let me summarize it to see if I fully understand it? The final wire format would be(I assume prefix is unified to 4 byte integer): request: 4byte total length 4byte header len header 4byte request len request response: 4byte total length 4byte header len header(with error string) 4byte response len(optional if error) response(optional if error) Still I prefer the most simple way. request: 4byte length request(with request no type bytes) response: 4byte length response(with optional response no type bytes and optional error) But if you choose the above solution, I'm also OK. What's the detail of avoiding buffer copy? I post in another thread that writedelimited to can't avoid buffer copy, and codedinputstream create a buffer to serialized, so unless we use one codeinputstream throughout code, it maybe be difficult otherwise.
        Hide
        Sanjay Radia added a comment -

        The final wire format would be(I assume prefix is unified to 4 byte integer):

        Yes except that on first len is 4bytes and the rest are varint. Note that since rpc has pluggable rpc engine, the rpc request itself is serialized based on the pluggable rpc engine. For protobuf-rpc-engine the length is serialized as varint.

        The individual lengths works well with the layering in the system. Wrt layering I mean both within the ipc/rpc layers and also the fact that we allow multiple rpc engines.

        Show
        Sanjay Radia added a comment - The final wire format would be(I assume prefix is unified to 4 byte integer): Yes except that on first len is 4bytes and the rest are varint . Note that since rpc has pluggable rpc engine, the rpc request itself is serialized based on the pluggable rpc engine. For protobuf-rpc-engine the length is serialized as varint. The individual lengths works well with the layering in the system. Wrt layering I mean both within the ipc/rpc layers and also the fact that we allow multiple rpc engines.
        Hide
        Binglin Chang added a comment -

        Note that since rpc has pluggable rpc engine, the rpc request itself is serialized based on the pluggable rpc engine. For protobuf-rpc-engine the length is serialized as varint.

        I don't understand why protobuf-rpc-engine has to be varint prefixd? I mean originally it is 4 byte int prefixed, and HADOOP-8084 changed it to varint prefixed, and then at some time later RpcResponseWritable changed to using varint, and RpcRequestWritable keep using 4 byte int.
        I just think 4 byte int is a better choice, because it easy to implement in non-blocking code.
        Just for example, see Todd's ipc implementation:
        https://github.com/toddlipcon/libhrpc/blob/master/rpc_client.cpp

          uint32_t length;
          CodedInputStream cis(&rbuf_[rbuf_consumed_], rbuf_available());
          bool success = cis.ReadVarint32(&length);
          if (!success) {
            if (rbuf_available() >= kMaxVarint32Size) {
              // TODO: error handling
              LOG(FATAL) << "bad varint";
            }
            // Not enough bytes in buffer to read varint, ask for at least another byte
            EnsureAvailableLength(rbuf_available() + 1,
                boost::bind(&RpcClient::ReadResponseHeaderLengthCallback, this,
                  asio::placeholders::error));
            return;
          }
        

        There are lot of retry logic to read varint in non-blocking code, it is much easier if it's fixed 4 byte, and hence unified.

        Show
        Binglin Chang added a comment - Note that since rpc has pluggable rpc engine, the rpc request itself is serialized based on the pluggable rpc engine. For protobuf-rpc-engine the length is serialized as varint. I don't understand why protobuf-rpc-engine has to be varint prefixd? I mean originally it is 4 byte int prefixed, and HADOOP-8084 changed it to varint prefixed, and then at some time later RpcResponseWritable changed to using varint, and RpcRequestWritable keep using 4 byte int. I just think 4 byte int is a better choice, because it easy to implement in non-blocking code. Just for example, see Todd's ipc implementation: https://github.com/toddlipcon/libhrpc/blob/master/rpc_client.cpp uint32_t length; CodedInputStream cis(&rbuf_[rbuf_consumed_], rbuf_available()); bool success = cis.ReadVarint32(&length); if (!success) { if (rbuf_available() >= kMaxVarint32Size) { // TODO: error handling LOG(FATAL) << "bad varint" ; } // Not enough bytes in buffer to read varint, ask for at least another byte EnsureAvailableLength(rbuf_available() + 1, boost::bind(&RpcClient::ReadResponseHeaderLengthCallback, this , asio::placeholders::error)); return ; } There are lot of retry logic to read varint in non-blocking code, it is much easier if it's fixed 4 byte, and hence unified.
        Hide
        Sanjay Radia added a comment -

        Won't the first 4byte length size (ie total len) be sufficient to allow reading the data in a non-blocking fashion.

        Show
        Sanjay Radia added a comment - Won't the first 4byte length size (ie total len) be sufficient to allow reading the data in a non-blocking fashion.
        Hide
        Binglin Chang added a comment -

        Sorry, I made a mistake, looks like currently RPC response doesn't have 4 byte total length, and you plan to add 4 byte total length to rpc response right? If that's the case, I think you are right, it is fine for non-blocking IO, although the prefix in rpc response body seams redundant.
        And since the total length is not known until response header and response body are serialized to some pre-allocated buffer to get the serialized size, so you plan to serialize header and body to some buffer first and then write to socket, IMO this is the same as using a single protobuf to include both header and body,

        Show
        Binglin Chang added a comment - Sorry, I made a mistake, looks like currently RPC response doesn't have 4 byte total length, and you plan to add 4 byte total length to rpc response right? If that's the case, I think you are right, it is fine for non-blocking IO, although the prefix in rpc response body seams redundant. And since the total length is not known until response header and response body are serialized to some pre-allocated buffer to get the serialized size, so you plan to serialize header and body to some buffer first and then write to socket, IMO this is the same as using a single protobuf to include both header and body,
        Hide
        Sanjay Radia added a comment -

        so you plan to serialize header and body to some buffer first and then write to socket

        That is what the current code does. This should be changed to use coded output stream to avoid the buffering; I believe this may require some more code change (not protocol changes) and that this can be done at later point.

        Show
        Sanjay Radia added a comment - so you plan to serialize header and body to some buffer first and then write to socket That is what the current code does. This should be changed to use coded output stream to avoid the buffering; I believe this may require some more code change (not protocol changes) and that this can be done at later point.
        Hide
        Binglin Chang added a comment -

        To archive this one potential issue is how to get the serialized total size before the serialization of header and body, for protobuf this can be done with "getSerializedSize", but other serialization system may not support this right, even it does, the interface is different so need to be addressed right?

        Show
        Binglin Chang added a comment - To archive this one potential issue is how to get the serialized total size before the serialization of header and body, for protobuf this can be done with "getSerializedSize", but other serialization system may not support this right, even it does, the interface is different so need to be addressed right?
        Hide
        Sanjay Radia added a comment -

        The general practice in protobuf is to make all fields optional.
        The current fields are required in our rpc protobufs - shall we make all or some of them optional?

        RpcRequestHeader
         required uint32 callId = 3; // each rpc has a callId that is also used in response
        
        RpcResponseHeader
          required uint32 callId = 1; // callId used in Request
          required RpcStatusProto status = 2;
        
        Protobuf engine's RpcRequestHeader
          required string methodName = 1;
          required string declaringClassProtocolName = 3;
          required uint64 clientProtocolVersion = 4;
        
        Show
        Sanjay Radia added a comment - The general practice in protobuf is to make all fields optional. The current fields are required in our rpc protobufs - shall we make all or some of them optional? RpcRequestHeader required uint32 callId = 3; // each rpc has a callId that is also used in response RpcResponseHeader required uint32 callId = 1; // callId used in Request required RpcStatusProto status = 2; Protobuf engine's RpcRequestHeader required string methodName = 1; required string declaringClassProtocolName = 3; required uint64 clientProtocolVersion = 4;
        Hide
        Binglin Chang added a comment -

        The current fields are required in our rpc protobufs - shall we make all or some of them optional?

        Agree, it is good for compatibility

        Show
        Binglin Chang added a comment - The current fields are required in our rpc protobufs - shall we make all or some of them optional? Agree, it is good for compatibility

          People

          • Assignee:
            Sanjay Radia
            Reporter:
            Binglin Chang
          • Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

            Dates

            • Created:
              Updated:

              Development