Bug 40029

Summary: mod_proxy should interoperate with RPC over HTTP
Product: Apache httpd-2 Reporter: Hans Maurer <hans>
Component: mod_proxy Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED INVALID    
Severity: enhancement CC: allmers, graham, kecon, rvandolson
Priority: P5    
Version: 2.2.2   
Target Milestone: ---   
Hardware: PC   
OS: FreeBSD   
URL: http://some.server/rpc

Description Hans Maurer 2006-07-12 13:25:58 UTC
This is more of an enhancement request, and it might concern the core as well: 
When Apache is used to proxy HTTP/1.1 requests and it encounters unknown
methods, it should relay the content of both the request and the response body
parts as they arrive - i.e. without any blocking, buffering or delaying.

Background:

I'm trying to grant road warrior users access to our company Exchange server
through RPC over HTTP.  In my setup, an Apache 2.2.2 on a FreeBSD server in the
DMZ should act as a proxy between the Internet and the IIS on the Exchange
server.  The communication is SSL-encrypted in both directions (SSLEngine and
SSLProxyEngine On).

Unfortunately, the Outlook client just hangs when trying to access Exchange
through the proxy.  The Apache error log shows these messages:

[Mon Jul 10 10:48:48 2006] [error] (70007)The timeout specified has expired:
proxy: prefetch request body failed to <exchangeip>:<port> (<exchange>) from
<clientip> ()

After working on this for quite some time, I believe I can rule out the usual
configuration and certificate problems that are described on various websites. 
Also, I have a Linux in my internal network with an older version of Apache
(2.0.53) where the same proxy configuration works (not too stable and
performant, but it does work).

I did some analysis with ssldump on both proxies.  Apparently, RPC over HTTP
opens two HTTP/1.1 requests:  One with request method RPC_IN_DATA to send data
to the server, and one with method RPC_OUT_DATA to send data back to the client.
 The body consists of raw binary data, and the connections are apparently
re-used for several RPCs.

I.e. after sending the headers for both connections, the client sends a request
on the IN connection, reads the response from the OUT connection, sends another
request on the IN connection and so on - which means that any buffering in the
proxy is absolutely deadly in this scenario.

Here's an example of an IN connection header:
    RPC_IN_DATA /rpc/rpcproxy.dll?<exchange>:6002 HTTP/1.1
    Accept: application/rpc
    User-Agent: MSRPC
    Host: <proxy>
    Content-Length: 1073741824
    Connection: Keep-Alive
    Cache-Control: no-cache
    Pragma: no-cache
    Authorization: Basic <user/passwd>

And here's an example of an OUT connection header:
    RPC_OUT_DATA /rpc/rpcproxy.dll?<exchange>:6002 HTTP/1.1
    Accept: application/rpc
    User-Agent: MSRPC
    Host: <proxy>
    Content-Length: 76
    Connection: Keep-Alive
    Cache-Control: no-cache
    Pragma: no-cache
    Authorization: Basic <user/passwd>

ssldump on the Apache 2.2.2 machine shows that the RPC_OUT_DATA is correctly
forwarded to the Exchange server.  For the RPC_IN_DATA, OTOH, the proxy doesn't
even open a connection to the Exchange server.  I can only guess that it's
trying to read (prefetch?) a part or all of the 1073741824 bytes
(Content-Length) before opening the session to the Exchange server.

Unfortunately, the client only sends a small request (~ 100 bytes) on the IN
connection and starts waiting for a response on the OUT connection. It never
gets one, though, since the request hasn't reached the Exchange server yet.

On the Apache 2.0.53 server, however, both requests are forwarded to the
Exchange server, and the body bits are also relayed in a direct and timely
manner.  I've tried an Apache 2.0.58 on the FreeBSD server, but that one doesn't
work, either.
Comment 1 Ruediger Pluem 2006-07-12 15:32:31 UTC
(In reply to comment #0)

> ssldump on the Apache 2.2.2 machine shows that the RPC_OUT_DATA is correctly
> forwarded to the Exchange server.  For the RPC_IN_DATA, OTOH, the proxy doesn't
> even open a connection to the Exchange server.  I can only guess that it's
> trying to read (prefetch?) a part or all of the 1073741824 bytes
> (Content-Length) before opening the session to the Exchange server.

Correct, we prefetch the whole body to avoid HTTP smuggling attacks with invalid
Content-Length headers. This is a security fix in 2.2.x and >= 2.0.55. (see
http://httpd.apache.org/security/vulnerabilities_20.html and
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2005-2088). Correct me if I am
wrong but I do not think that RPC_IN_DATA and RPC_OUT_DATA are specified in any RFC.

> 
> Unfortunately, the client only sends a small request (~ 100 bytes) on the IN
> connection and starts waiting for a response on the OUT connection. It never
> gets one, though, since the request hasn't reached the Exchange server yet.

This is an incorrect use of the HTTP protocol. Bad luck for Microsoft.

So I do not see any chance that we can do anything here. => Invalid
Comment 2 Hans Maurer 2006-07-12 16:11:23 UTC
So, there's no chance for "be generous in what you accept"? (Apache is already
"strict in what it delivers")
Comment 3 Ruediger Pluem 2006-07-12 20:57:50 UTC
Sorry, not in this case as this opens up a security hole if we are not strict here.
Comment 4 Joe Orton 2006-07-13 10:14:36 UTC
The fix for CVE-2005-2088 was simply to discard the C-L header if a T-E header
was also present, that was a change to request.c and the changes to the proxy
are entirely unrelated.

I don't see why this shouldn't work; the requests are syntactically valid, the
proxy doesn't care about method semantics.  Why is it timing out?  Because it
attempts to "prefetch" 8K and the 100 bytes sent are not enough? That is pretty
icky.
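
For reference, the C-L/T-E rule mentioned at the start of this comment can be sketched as follows. This is only an illustrative Python model, not the actual request.c change; the function name drop_ambiguous_length and the dict representation of the headers are assumptions made for the example.

```python
def drop_ambiguous_length(headers):
    """Return a copy of `headers` without Content-Length when
    Transfer-Encoding is also present (hypothetical helper)."""
    names = {name.lower() for name in headers}
    if "transfer-encoding" in names:
        # Both headers present: the length is ambiguous, so discard C-L.
        return {k: v for k, v in headers.items() if k.lower() != "content-length"}
    return dict(headers)


# The RPC_IN_DATA request from the description carries only Content-Length,
# so a rule of this kind would leave it untouched.
rpc_in = {"Content-Length": "1073741824", "Connection": "Keep-Alive"}
smuggle = {"Content-Length": "10", "Transfer-Encoding": "chunked"}
print(drop_ambiguous_length(rpc_in))
print(drop_ambiguous_length(smuggle))
```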
Comment 5 Hans Maurer 2006-07-13 16:53:47 UTC
(In reply to comment #4)
> Why is it timing out?  Because it
> attempts to "prefetch" 8K and the 100 bytes sent are not enough?

Yes, I think so.  Outlook sends 100 bytes, and then waits for a response on
the second connection before sending additional data.  OTOH, Apache waits for
additional data before relaying the 100 bytes to the Exchange server in the
first place.  Classic deadlock.
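
The deadlock can be modelled in a few lines. This is a toy sketch, not httpd code: the 8K threshold is simply taken from the "8K" in comment 4, and the function name and the forwarding rule are assumptions made for illustration.

```python
PREFETCH_TARGET = 8 * 1024  # assumed threshold, from the "8K" in comment 4


def proxy_would_forward(bytes_received, content_length, target=PREFETCH_TARGET):
    """Toy model: the proxy contacts the backend once it has either the
    prefetch target or the whole declared body, whichever is smaller."""
    return bytes_received >= min(target, content_length)


# IN channel: declares 1073741824 bytes, sends ~100, then waits on OUT.
print(proxy_would_forward(100, 1073741824))  # False: proxy keeps waiting
# OUT channel: declares 76 bytes and sends all of them.
print(proxy_would_forward(76, 76))  # True: request reaches the backend
```

Under this model the OUT request is forwarded and the IN request never is, which matches what ssldump showed in the description.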
Comment 6 ml 2009-06-06 15:09:52 UTC
Hello,

with Apache 2.2.11 it is still not functioning. Microsoft released a protocol specification here: http://msdn.microsoft.com/en-us/library/cc243950.aspx

Does that change anything? Is there a chance that Apache will support RPC over HTTP in the future? I think there is major interest in using Apache as a proxy for possibly insecure IIS applications like this. Thank you!

Regards
Christoph Kling
Comment 7 Nick Kew 2009-06-07 00:40:26 UTC
(In reply to comment #6)
> Hello,
> 
> with Apache 2.2.11 it is still not functioning. Microsoft released a protocol
> specification here: http://msdn.microsoft.com/en-us/library/cc243950.aspx

Quoting from that:

"Patents. Microsoft has patents that may cover your implementations of the technologies described in the Open Specifications. Neither this notice nor Microsoft's delivery of the documentation grants any licenses under those or any other Microsoft patents."

MS is not a standards body, and their having published something doesn't mean it's been peer-reviewed, or even implemented!  I don't see any guarantee that if Apache implements exactly what's written, it really will interoperate with MS technology.  Maybe there's a test suite, but who would look that far after reading the patent threat?
Comment 8 Emmanuel Fusté 2010-07-13 08:59:25 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > Hello,
> > 
> > with Apache 2.2.11 it is still not functioning. Microsoft released a protocol
> > specification here: http://msdn.microsoft.com/en-us/library/cc243950.aspx
> 
> Quoting from that:
> 
> "Patents. Microsoft has patents that may cover your implementations of the
> technologies described in the Open Specifications. Neither this notice nor
> Microsoft's delivery of the documentation grants any licenses under those or
> any other Microsoft patents."
> 

Sorry, I can't buy this kind of argument.
It is not about implementing RPC over HTTP, but about letting RPC over HTTP pass through.
This is a silly Apache httpd limitation that serves no good purpose.
Comment 9 Emmanuel Fusté 2010-07-13 09:00:12 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > Hello,
> > 
> > with Apache 2.2.11 it is still not functioning. Microsoft released a protocol
> > specification here: http://msdn.microsoft.com/en-us/library/cc243950.aspx
> 
> Quoting from that:
> 
> "Patents. Microsoft has patents that may cover your implementations of the
> technologies described in the Open Specifications. Neither this notice nor
> Microsoft's delivery of the documentation grants any licenses under those or
> any other Microsoft patents."
> 

Sorry, I can't buy this kind of argument.
It is not about implementing RPC over HTTP, but about letting RPC over HTTP pass through.
This is a silly Apache httpd limitation that serves no good purpose.
Comment 10 William A. Rowe Jr. 2010-07-16 11:37:59 UTC
Hans, this is an HTTP protocol question, unrecognized methods are allowed
but they must follow HTTP/1.1 itself, and if MS's protocol isn't HTTP/1.1
compliant, we won't be accommodating.

HTTP/1.1 is not bi-sync, it is message/resource oriented.

You have stated that 100 bytes of the request message are sent, and that we
are blocking for 8kb; what is the Content-Length header of this case?

There is a not-altogether unreasonable solve to this but it's not trivial; use 
proxy_connect for specific methods which turn out to be HTTP/1.1 non-compliant,
which would turn the tunnel into a connection stream.  Patches welcome.
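
As a rough illustration of what "a connection stream" means here: once the headers have been handled, the proxy would stop interpreting the body and just copy bytes in both directions as they arrive. The Python sketch below shows that idea only; it is not mod_proxy_connect code, and the names relay and idle_timeout are hypothetical.

```python
import select
import socket


def relay(client, backend, idle_timeout=1.0):
    """Copy bytes between two connected sockets as they arrive,
    without waiting for any minimum amount of data."""
    peer = {client: backend, backend: client}
    while True:
        readable, _, _ = select.select([client, backend], [], [], idle_timeout)
        if not readable:
            return  # idle; a real tunnel would apply its own timeout policy
        for sock in readable:
            data = sock.recv(4096)
            if not data:
                return  # one side closed the connection
            peer[sock].sendall(data)  # forward immediately, no prefetch
```

With a relay of this kind, a 100-byte write on the client side is passed on as soon as it is readable, so the client is never left waiting for the proxy to collect more data.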
Comment 11 Hans Maurer 2010-07-16 13:35:26 UTC
William,

since you're addressing me personally:  We moved to a VPN-based solution long ago, so I don't need this functionality anymore.

However, there are several comments and votes by other people, and I also occasionally get mails from people asking me how I worked around this problem.  So, there still seems to be some demand for this feature out there.

Best regards,
  Hans
Comment 12 William A. Rowe Jr. 2010-07-16 14:14:50 UTC
If you can remember whether the 100-byte request carried a Content-Length of 100 bytes, or one that was definitely larger, that would help.

Anyone; if you have the opportunity to sniff the httpd -> backend connection and
post what leads to this hang on the near side of the conversation, that would be 
great.
Comment 13 Hans Maurer 2010-07-16 14:23:03 UTC
William,

actually, there were(*) two parallel HTTP requests, one for traffic from Outlook to Exchange and one for traffic from Exchange to Outlook.  The "upstream" request had a Content-Length header of about 2 GB.  The initial ticket description (and my other comments from 2006) contain the headers of both requests as well as an analysis of why Apache's buffering causes a deadlock with Microsoft's - well - creative RPC over HTTP implementation.

(*) That was Outlook 2003 with Exchange 2003.  I never checked for Outlook 2007/2010 and Exchange 2008.

Best regards,
  Hans
Comment 14 Hans Maurer 2010-07-16 14:24:48 UTC
(In reply to comment #13)
> The "upstream" request had a Content-Length header of about 2 GB.

Oh, actually 1 GB, not 2 GB.
Comment 15 William A. Rowe Jr. 2010-07-16 14:36:16 UTC
Ok, so that's a 1GB 'open ended' pipe, and the client expects synchronicity,
which it's absolutely not allowed to do.

Thanks for clarifying, that's what I thought you were getting at.

Will ponder interesting solutions, but in the interim it is HTTP/1.1 abuse.
Comment 16 William A. Rowe Jr. 2010-07-16 14:36:53 UTC
Still/now 'new'
Comment 17 Graham Mainwaring 2010-07-16 15:46:57 UTC
I don't think it's HTTP 1.1 abuse. I can't find anything in RFC2068 that speaks to this point one way or the other. I don't think it's required for mod_proxy to implement this, but it certainly would not be RFC-violating to do so.

Suppose a back-end server sends a response with content-length: 1000 but only sends 100 bytes. In that case mod_proxy transmits the 100 bytes to the requestor and waits for further data.

But if the incoming request has content-length: 1000 but only sends 100 bytes, mod_proxy just sits there. Why not open the request to the back-end right away? It would improve performance even on regular GET requests, because you're making use of time that would otherwise be wasted on network latency to get the back-end connection open, which means you'll be able to generate a response that much faster. If you have all the HTTP headers and part of the request data, what's the benefit in *not* starting the connection to the back-end, since you know you're going to need it?
Comment 18 Graham Mainwaring 2010-07-16 15:56:29 UTC
Also, regarding vulnerability CVE-2005-2088, surely this can be solved by improving the header parsing rather than by destroying useful functionality that people were using.
Comment 19 Ray Van Dolson 2010-07-27 14:50:05 UTC
I understand the ethical reasons for wanting to implement this, but it would be nice to have perhaps some override options we could specify in a per <Directory> context.

I was initially thinking of pre-mangling the Content-Length header, but I doubt this would consistently provide the desired result.

Apache does have other "workarounds" for goofy non-compliant stuff...
Comment 20 Ray Van Dolson 2010-07-27 14:50:51 UTC
(In reply to comment #19)
> I understand the ethical reasons for wanting to implement this, but it would be

This should be "for *not* wanting". :)
Comment 21 William A. Rowe Jr. 2010-07-27 15:05:09 UTC
>> Why not open the request to the back-end right away?
>> It would improve performance even on regular GET requests

In fact, it does nothing of the kind; it increases the contention for
the backend servers.  The current behavior is correct for taking the stress
off the much more computationally intensive backend applications.

HTTP/1.1 Content-Length description is prescriptive of the server's behavior,
and this incorrectly implemented protocol could have *trivially* used the
semantically sensible chunked encoding methodology.

The one and only hack around HTTP/1.1 non-compliance is to open a connection 
oriented stream and distrust the entire communications stream.
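
To illustrate the chunked alternative: with Transfer-Encoding: chunked, each piece of data is framed with its own length, so a client would not need to declare a 1 GB placeholder Content-Length up front. The Python sketch below only shows the HTTP/1.1 chunk framing; the function name is made up for the example, and whether an intermediary forwards each chunk promptly is a separate matter.

```python
def encode_chunk(payload: bytes) -> bytes:
    """Frame one chunk: hex length, CRLF, data, CRLF."""
    return f"{len(payload):X}".encode("ascii") + b"\r\n" + payload + b"\r\n"


LAST_CHUNK = b"0\r\n\r\n"  # zero-length chunk terminates the body

print(encode_chunk(b"hello"))  # b'5\r\nhello\r\n'
# A 100-byte fragment would be announced as hex 64:
print(encode_chunk(b"x" * 100)[:4])  # b'64\r\n'
```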
Comment 22 William A. Rowe Jr. 2010-09-24 02:20:07 UTC
The question was raised; "Microsoft released a spec"

In this case, only the IETF defines HTTP.

If it complies with HTTP, then anyone is free to build upon it.  See the DAV spec for one example.

If it fails to comply with HTTP, it isn't HTTP, and the ASF HTTP Server project is unlikely to pay attention; *particularly* if it masquerades as HTTP and is not.
Comment 23 William A. Rowe Jr. 2010-09-24 02:22:51 UTC
Final note of the day; I've broached the question on the ietf-http-wg list for pointers to any MS bug or KB references to this misimplementation, and pointers to where users can raise the issue.  I have yet to hear back, but when I do I'll update this report.

In the interim, after lengthy consideration, this is not an httpd proxy flaw.
Comment 24 Micha Lenk 2013-04-25 13:32:59 UTC
During my work at Astaro, I wrote an Apache module mod_proxy_msrpc that intends
to work around the mentioned limitations of Apache httpd by switching to a
transparent tunnel mode (similar to mod_proxy_connect) as soon as the RPC
connection has been successfully negotiated between client and server. It is
available on GitHub here: https://github.com/bombadil/mod_proxy_msrpc