Thrift
  THRIFT-669

Use Http chunk encoding to do full duplex transfer in a single post

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.2
    • Fix Version/s: None
    • Component/s: Java - Library
    • Labels: None
    • Patch Info: Patch Available

      Description

      Instead of each method call being a separate POST, use a chunk-encoded request. If you look at the traffic in Wireshark, the payload is often much smaller than the HTTP header. With chunked encoding, the per-method overhead of the HTTP header is gone. Running a simple test of getting a time as an i32, switching from HTTP POST to chunked encoding took me from 100+ ms to ~40 ms per request, since the servlet container did not have to process the overhead of a "new request".

      Moreover, I think that with Jetty and continuations, the long-running connections could scale and perform a lot better than the current HttpClient.
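
      To make the proposal concrete, here is a minimal sketch of the idea (this is not the attached patch: the /ds/ path, port, payloads, and helper names are illustrative, and a plain POST is used even though the capture later in this issue uses CONNECT). One request header is written once, every serialized Thrift call goes out as its own request chunk, and every reply is read back as a response chunk on the same connection.

      import java.io.ByteArrayOutputStream;
      import java.io.IOException;
      import java.io.InputStream;
      import java.io.OutputStream;
      import java.net.Socket;
      import java.nio.charset.StandardCharsets;

      // Illustrative sketch only: one connection, one chunked POST, many calls.
      public class FullDuplexChunkSketch {

          public static void main(String[] args) throws IOException {
              try (Socket sock = new Socket("localhost", 8080)) {
                  OutputStream out = sock.getOutputStream();
                  InputStream in = sock.getInputStream();

                  // The request header is written exactly once for the whole session.
                  out.write(("POST /ds/ HTTP/1.1\r\n"
                          + "Host: localhost:8080\r\n"
                          + "Content-Type: application/x-thrift\r\n"
                          + "Transfer-Encoding: chunked\r\n"
                          + "\r\n").getBytes(StandardCharsets.US_ASCII));
                  out.flush();

                  boolean responseHeadersConsumed = false;
                  for (int i = 0; i < 5; i++) {
                      // A real client would pass the serialized Thrift message bytes here.
                      byte[] call = ("call-" + i).getBytes(StandardCharsets.US_ASCII);
                      writeChunk(out, call);

                      // The response status line and headers arrive once, before the
                      // first response chunk, and must be consumed before de-chunking.
                      if (!responseHeadersConsumed) {
                          while (!readAsciiLine(in).isEmpty()) { /* skip headers */ }
                          responseHeadersConsumed = true;
                      }
                      byte[] reply = readChunk(in);
                      System.out.println("reply bytes: " + reply.length);
                  }
                  writeChunk(out, new byte[0]); // a zero-length chunk ends the request
              }
          }

          // One HTTP/1.1 chunk: hex length, CRLF, payload, CRLF.
          static void writeChunk(OutputStream out, byte[] data) throws IOException {
              out.write((Integer.toHexString(data.length) + "\r\n").getBytes(StandardCharsets.US_ASCII));
              out.write(data);
              out.write("\r\n".getBytes(StandardCharsets.US_ASCII));
              out.flush();
          }

          // Reads one chunk; ignores chunk extensions and trailers for brevity.
          static byte[] readChunk(InputStream in) throws IOException {
              int size = Integer.parseInt(readAsciiLine(in).trim(), 16);
              byte[] data = new byte[size];
              int off = 0;
              while (off < size) {
                  int n = in.read(data, off, size - off);
                  if (n < 0) throw new IOException("connection closed mid-chunk");
                  off += n;
              }
              readAsciiLine(in); // consume the CRLF that follows the payload
              return data;
          }

          // Reads a CRLF-terminated ASCII line, returned without the CRLF.
          static String readAsciiLine(InputStream in) throws IOException {
              ByteArrayOutputStream buf = new ByteArrayOutputStream();
              int b;
              while ((b = in.read()) != -1 && b != '\n') {
                  if (b != '\r') buf.write(b);
              }
              return new String(buf.toByteArray(), StandardCharsets.US_ASCII);
          }
      }

      The point of the sketch is the framing: after the one-time headers, the per-call cost on the wire is a few bytes of hex chunk length rather than a full set of HTTP headers, which is where the reported latency win comes from.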

        Activity

        Aron Sogor added a comment -

        Chunk encoding client

        Bryan Duxbury added a comment -

        Does this buffer more than one method call into a single request?

        Aron Sogor added a comment -

        It is a single request, BUT there is no buffering. HTTP chunked encoding allows sending the request and the response in chunks, so each method call is a request chunk and each response is a response chunk.

        In pseudo form:

        <HTTP REQUEST HEADER>
        <chunk1 request>
        <HTTP RESPONSE HEADER>
        <chunk1 response>
        <chunk2 request>
        <chunk2 response>

        Here is a capture from Wireshark (no color) calling gettime in a loop:

        CONNECT /ds/ HTTP/1.1
        Host: localhost:8080
        User-Agent: BattleNet
        Transfer-Encoding: chunked
        content-type: application/x-thrift
        Accept: */*

        11
        ........time.....HTTP/1.1 200 OK
        Content-Type: application/x-thrift
        Transfer-Encoding: chunked
        Server: Jetty(7.0.1.v20091125)

        18
        ........time............
        11
        ........time.....
        18
        ........time............
        11
        ........time.....

        <and many more>

        This is different from the current HTTP client, where each method call creates a new HTTP REQUEST/RESPONSE HEADER pair.
        Check with Wireshark.
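
        For reference, the chunk-size lines in the capture ("11", "18") are hexadecimal lengths, i.e. 17- and 24-byte payloads. On the server side, a servlet container such as Jetty de-chunks the request body before the servlet sees it, so a minimal sketch of the receiving end could look like the following (class and helper names are hypothetical, and interleaving reads and writes within one request relies on container behavior rather than on a servlet-spec guarantee):

        import java.io.IOException;
        import java.io.InputStream;
        import java.io.OutputStream;

        import javax.servlet.http.HttpServlet;
        import javax.servlet.http.HttpServletRequest;
        import javax.servlet.http.HttpServletResponse;

        // Hypothetical receiving end: read request bytes as they arrive, flush each
        // reply immediately so Jetty emits it as a chunk of the (chunked) response.
        public class ThriftChunkServletSketch extends HttpServlet {

            @Override
            protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
                resp.setContentType("application/x-thrift");
                InputStream in = req.getInputStream();
                OutputStream out = resp.getOutputStream();

                byte[] buf = new byte[4096];
                int n;
                // Note: the container hides chunk boundaries, so message boundaries must
                // come from the Thrift protocol/framing, not from the chunks themselves.
                while ((n = in.read(buf)) != -1) {
                    byte[] reply = handleThriftBytes(buf, n); // placeholder for a TProcessor call
                    out.write(reply);
                    out.flush(); // no Content-Length is set, so each flush goes out as a chunk
                }
            }

            // Placeholder: echoes the input; a real server would run the Thrift processor.
            private byte[] handleThriftBytes(byte[] buf, int len) {
                byte[] copy = new byte[len];
                System.arraycopy(buf, 0, copy, 0, len);
                return copy;
            }
        }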

        Aron Sogor added a comment -

        I wrote up some more explanation: http://hungariannotation.blogspot.com/2010/02/no-rpc-for-async-io-case-for-flash-as3.html and also sent a patch for Java and AS3 in https://issues.apache.org/jira/browse/THRIFT-518

        Bryan Duxbury added a comment -

        I'd kind of like to commit this patch, but there are a few oddities I think we might want to address.

        In read(), there's an else case that prints some anonymous-looking variable to stdout. Is this intentional, or debugging?

        In write(), the user agent is set as "BattleNet". Seems misleading. Maybe "TFullDuplexHttpClient" instead?

        Would be nice to move the constructor closer to the top of the file.

        open() should probably throw TTransportException in the case of problems instead of just printing the stack trace (see the sketch after this comment).

        The file is indented with tabs. We use spaces.
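
        As a sketch of the User-Agent and open() points above (this is not the attached patch; the field and class names are illustrative), open() can surface failures as a TTransportException and the header can carry a self-describing agent string:

        import java.io.IOException;
        import java.net.Socket;

        import org.apache.thrift.transport.TTransportException;

        // Sketch of the review suggestions above; names are illustrative only.
        public class OpenSketch {
            // A self-describing agent string instead of "BattleNet".
            private static final String USER_AGENT = "TFullDuplexHttpClient";

            private final String host;
            private final int port;
            private Socket socket;

            public OpenSketch(String host, int port) {
                this.host = host;
                this.port = port;
            }

            public void open() throws TTransportException {
                try {
                    socket = new Socket(host, port);
                    // ... write the chunked request header here, including
                    // "User-Agent: " + USER_AGENT ...
                } catch (IOException e) {
                    // Propagate the failure instead of printing the stack trace.
                    throw new TTransportException(TTransportException.NOT_OPEN, e);
                }
            }
        }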

        Tony Kinnis added a comment - edited

        It looks like this client makes assumptions in the read method that are not valid in some cases. While the initial request states that it is using Transfer-Encoding: chunked, that does not mean the server will respond with a chunked response of its own. There are other aspects of the HTTP specification that are also not implemented.

        This client seems too limited to be used in many environments since it assumes that the server will always return a chunked response and nothing else.

        There was a recent commit to THttpClient, THRIFT-970, that allows it to use the Apache HttpComponents client, which is more complete and can support chunking. Maybe this implementation could be adapted to use Apache HttpComponents based on that example instead of rolling its own HTTP client?

        THttpClient currently buffers the response in the flush method, in theory to prevent clients from exhausting ephemeral ports and holding open connections on the servers. It would have to be implemented slightly differently in this class to achieve a buffer-less read, but it wouldn't be too different from what is currently implemented in THttpClient.
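
        As a sketch of that route (assuming the THRIFT-970 THttpClient constructor that accepts an org.apache.http.client.HttpClient; the URL and protocol choice are illustrative), the HttpComponents-backed transport would be wired up roughly like this; whether it can be made to stream chunks in both directions is exactly the open question in this issue:

        import org.apache.http.client.HttpClient;
        import org.apache.http.impl.client.DefaultHttpClient;
        import org.apache.thrift.protocol.TBinaryProtocol;
        import org.apache.thrift.transport.THttpClient;

        // Sketch of the THRIFT-970 path: let Apache HttpComponents handle the HTTP layer.
        public class HttpComponentsSketch {
            public static void main(String[] args) throws Exception {
                HttpClient httpClient = new DefaultHttpClient(); // HttpComponents 4.x of that era
                THttpClient transport = new THttpClient("http://localhost:8080/ds/", httpClient);
                TBinaryProtocol protocol = new TBinaryProtocol(transport);
                // ... hand `protocol` to a generated service client; note that each flush()
                // still issues one POST per call, so this alone does not remove the
                // per-call header overhead discussed in this issue.
                transport.close();
            }
        }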

          People

          • Assignee: Unassigned
          • Reporter: Aron Sogor
          • Votes: 0
          • Watchers: 1
