This is more of an enhancement request, and it might concern the core as well: When Apache is used to proxy HTTP/1.1 requests and it encounters unknown methods, it should relay the content of both the request and the response body parts as they arrive - i.e. without any blocking, buffering or delaying.

Background: I'm trying to grant road warrior users access to our company Exchange server through RPC over HTTP. In my setup, an Apache 2.2.2 on a FreeBSD server in the DMZ should act as a proxy between the Internet and the IIS on the Exchange server. The communication is SSL-encrypted in both directions (SSLEngine and SSLProxyEngine On).

Unfortunately, the Outlook client just hangs when trying to access Exchange through the proxy. The Apache error log shows these messages:

[Mon Jul 10 10:48:48 2006] [error] (70007)The timeout specified has expired: proxy: prefetch request body failed to <exchangeip>:<port> (<exchange>) from <clientip> ()

After working on this for quite some time, I believe I can rule out the usual configuration and certificate problems that are described on various websites. Also, I have a Linux machine in my internal network with an older version of Apache (2.0.53) where the same proxy configuration works (not very stable or performant, but it does work).

I did some analysis with ssldump on both proxies. Apparently, RPC over HTTP opens two HTTP/1.1 requests: one with request method RPC_IN_DATA to send data to the server, and one with method RPC_OUT_DATA to send data back to the client. The body consists of raw binary data, and the connections are apparently re-used for several RPCs. I.e. after sending the headers for both connections, the client sends a request on the IN connection, reads the response from the OUT connection, sends another request on the IN connection and so on - which means that any buffering in the proxy is absolutely deadly in this scenario.
Here's an example of an IN connection header:

    RPC_IN_DATA /rpc/rpcproxy.dll?<exchange>:6002 HTTP/1.1
    Accept: application/rpc
    User-Agent: MSRPC
    Host: <proxy>
    Content-Length: 1073741824
    Connection: Keep-Alive
    Cache-Control: no-cache
    Pragma: no-cache
    Authorization: Basic <user/passwd>

And here's an example of an OUT connection header:

    RPC_OUT_DATA /rpc/rpcproxy.dll?<exchange>:6002 HTTP/1.1
    Accept: application/rpc
    User-Agent: MSRPC
    Host: <proxy>
    Content-Length: 76
    Connection: Keep-Alive
    Cache-Control: no-cache
    Pragma: no-cache
    Authorization: Basic <user/passwd>

ssldump on the Apache 2.2.2 machine shows that the RPC_OUT_DATA is correctly forwarded to the Exchange server. For the RPC_IN_DATA, OTOH, the proxy doesn't even open a connection to the Exchange server. I can only guess that it's trying to read (prefetch?) a part or all of the 1073741824 bytes (Content-Length) before opening the session to the Exchange server.

Unfortunately, the client only sends a small request (~ 100 bytes) on the IN connection and starts waiting for a response on the OUT connection. It never gets one, though, since the request hasn't reached the Exchange server yet.

On the Apache 2.0.53 server, however, both requests are forwarded to the Exchange server, and the body bits are also relayed in a direct and timely manner. I've tried an Apache 2.0.58 on the FreeBSD server, but that one doesn't work, either.
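For readers who want to check the numbers in the IN connection header, here is a minimal Python sketch that parses it. The function name `parse_request_head` is made up for illustration and is not part of Apache httpd; it only shows that the request line is syntactically ordinary HTTP/1.1 with an extension method, and that the declared body is exactly 1 GiB while only about 100 bytes are actually sent at first.

```python
def parse_request_head(head):
    """Split a raw HTTP/1.1 request head into method, target, version, headers.

    Illustrative helper only; header names are lower-cased for lookup.
    """
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:  # blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


# Header values copied from the IN connection example in this report.
head = (
    "RPC_IN_DATA /rpc/rpcproxy.dll?<exchange>:6002 HTTP/1.1\r\n"
    "Accept: application/rpc\r\n"
    "User-Agent: MSRPC\r\n"
    "Host: <proxy>\r\n"
    "Content-Length: 1073741824\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)

method, target, version, headers = parse_request_head(head)
declared = int(headers["content-length"])
print(method, version, declared, declared == 2**30)
```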
(In reply to comment #0)
> ssldump on the Apache 2.2.2 machine shows that the RPC_OUT_DATA is correctly
> forwarded to the Exchange server. For the RPC_IN_DATA, OTOH, the proxy doesn't
> even open a connection to the Exchange server. I can only guess that's it's
> trying to read (prefetch?) a part or all of the 1073741824 bytes
> (Content-Length) before opening the session to the Exchange server.

Correct, we prefetch the whole body to avoid HTTP smuggling attacks with invalid Content-Length headers. This is a security fix in 2.2.x and >= 2.0.55 (see http://httpd.apache.org/security/vulnerabilities_20.html and http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2005-2088). Correct me if I am wrong, but I do not think that RPC_IN_DATA and RPC_OUT_DATA are specified in any RFC.

> Unfortunately, the client only sends a small request (~ 100 bytes) on the IN
> connection and starts waiting for a response on the OUT connection. It never
> gets one, though, since the request hasn't reached the Exchange server yet.

This is an incorrect use of the HTTP protocol. Bad luck for Microsoft. So I do not see any chance that we can do anything here.

=> Invalid
So, there's no chance for "be generous in what you accept"? (Apache is already "strict in what it delivers")
Sorry, not in this case as this opens up a security hole if we are not strict here.
The fix for CVE-2005-2088 was simply to discard the C-L header if a T-E header was also present; that was a change to request.c, and the changes to the proxy are entirely unrelated. I don't see why this shouldn't work: the requests are syntactically valid, and the proxy doesn't care about method semantics. Why is it timing out? Because it attempts to "prefetch" 8K and the 100 bytes sent are not enough? That is pretty icky.
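To make the description of the CVE-2005-2088 fix concrete, here is a minimal Python sketch of the rule as stated above (drop Content-Length when Transfer-Encoding is also present). This is not httpd's request.c code; the function name `normalize_framing_headers` and the sample header values are assumptions chosen for illustration. Note that an RPC_IN_DATA request carries no Transfer-Encoding header, so under this rule its headers pass through unchanged.

```python
def normalize_framing_headers(headers):
    """Return a copy of `headers` with Content-Length removed when
    Transfer-Encoding is also present (names compared case-insensitively)."""
    names = {name.lower() for name in headers}
    if "transfer-encoding" in names and "content-length" in names:
        return {k: v for k, v in headers.items() if k.lower() != "content-length"}
    return dict(headers)


# Ambiguous framing, the kind of request used for smuggling (made-up values).
smuggled = {"Host": "proxy", "Content-Length": "44", "Transfer-Encoding": "chunked"}
# Framing as seen on the RPC_IN_DATA request in this report.
rpc_in = {"Host": "proxy", "Content-Length": "1073741824"}

print(normalize_framing_headers(smuggled))
print(normalize_framing_headers(rpc_in))
```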
(In reply to comment #4)
> Why is it timing out? Because it
> attempts to "prefetch" 8K and the 100 bytes sent are not enough?

Yes, I think so. Outlook sends 100 bytes, and then waits for a response on the second connection before sending additional data. OTOH, Apache waits for additional data before relaying the 100 bytes to the Exchange server in the first place. Classic deadlock.
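The deadlock can be written down as a toy model in Python. This is a sketch under stated assumptions, not a description of mod_proxy internals: the 8 KB prefetch target is taken from the guess in this thread, and the functions `proxy_forwards` and `client_sends_more` are hypothetical names for the two waiting conditions.

```python
PREFETCH_TARGET = 8 * 1024  # assumed prefetch size, from the "8K" guess above


def proxy_forwards(bytes_received, content_length, prefetch_target=PREFETCH_TARGET):
    """Toy model: the proxy contacts the backend only once it has buffered
    the prefetch target or the whole declared body, whichever is smaller."""
    return bytes_received >= min(prefetch_target, content_length)


def client_sends_more(response_seen_on_out):
    """Toy model: the client sends more IN data only after it has read a
    response on the OUT connection."""
    return response_seen_on_out


received = 100                # bytes the client actually sent on IN
content_length = 1073741824   # declared Content-Length on IN

forwarded = proxy_forwards(received, content_length)
response_seen = forwarded     # the backend cannot answer what it never received
deadlocked = not forwarded and not client_sends_more(response_seen)

print("IN forwarded:", forwarded)
print("OUT forwarded:", proxy_forwards(76, 76))
print("deadlocked:", deadlocked)
```

In this model the OUT request (76 bytes declared, 76 sent) passes the check, which matches the ssldump observation that only the IN request gets stuck.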
Hello,

with Apache 2.2.11 it is still not functioning. Microsoft released a protocol specification here: http://msdn.microsoft.com/en-us/library/cc243950.aspx

Does that change anything? Is there a chance that Apache will support RPC over HTTP in the future? I think there is major interest in using Apache as a proxy for possibly insecure IIS applications like this.

Thank you!

Regards
Christoph Kling
(In reply to comment #6)
> Hello,
>
> with Apache 2.2.11 it is still not functioning. Microsoft released a protocol
> specification here: http://msdn.microsoft.com/en-us/library/cc243950.aspx

Quoting from that:

"Patents. Microsoft has patents that may cover your implementations of the technologies described in the Open Specifications. Neither this notice nor Microsoft's delivery of the documentation grants any licenses under those or any other Microsoft patents."

MS is not a standards body, and their having published something doesn't mean it's been peer-reviewed, or even implemented! I don't see any guarantee that if Apache implements exactly what's written, it really will interoperate with MS technology. Maybe there's a test suite, but who would look that far after reading the patent threat?
(In reply to comment #7)
> (In reply to comment #6)
> > Hello,
> >
> > with Apache 2.2.11 it is still not functioning. Microsoft released a protocol
> > specification here: http://msdn.microsoft.com/en-us/library/cc243950.aspx
>
> Quoting from that:
>
> "Patents. Microsoft has patents that may cover your implementations of the
> technologies described in the Open Specifications. Neither this notice nor
> Microsoft's delivery of the documentation grants any licenses under those or
> any other Microsoft patents."

Sorry, I can't buy this kind of argument. It is not about implementing RPC over HTTP, but about letting RPC over HTTP pass through. This is a silly Apache httpd limitation for no good reason.
Hans, this is an HTTP protocol question. Unrecognized methods are allowed, but they must follow HTTP/1.1 itself, and if MS's protocol isn't HTTP/1.1 compliant, we won't be accommodating. HTTP/1.1 is not bi-sync; it is message/resource oriented. You have stated that 100 bytes of the request message are sent, and that we are blocking for 8 KB; what is the Content-Length header in this case?

There is a not-altogether unreasonable solution to this, but it's not trivial: use proxy_connect for specific methods which turn out to be HTTP/1.1 non-compliant, which would turn the tunnel into a connection stream. Patches welcome.
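As a rough illustration of what "turning the tunnel into a connection stream" means, here is a minimal Python sketch of a bidirectional byte relay. It is not mod_proxy_connect code; the `tunnel` and `recv_exact` functions are hypothetical, and local socket pairs stand in for the client and backend connections. The point is only that each chunk is forwarded as soon as it arrives, with no attempt to interpret Content-Length.

```python
import selectors
import socket
import threading


def tunnel(a, b, bufsize=4096):
    """Relay bytes between sockets a and b in both directions as they arrive,
    until either side closes. Holds at most one recv() worth of data."""
    sel = selectors.DefaultSelector()
    sel.register(a, selectors.EVENT_READ, b)  # data read from a goes to b
    sel.register(b, selectors.EVENT_READ, a)  # data read from b goes to a
    try:
        while True:
            for key, _ in sel.select():
                data = key.fileobj.recv(bufsize)
                if not data:  # peer closed its side: stop relaying
                    return
                key.data.sendall(data)
    finally:
        sel.close()


def recv_exact(sock, n):
    """Read exactly n bytes (or fewer if the peer closes first)."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf


# Stand-ins: client <-> front (proxy side), back (proxy side) <-> server.
client, front = socket.socketpair()
back, server = socket.socketpair()
worker = threading.Thread(target=tunnel, args=(front, back))
worker.start()

client.sendall(b"x" * 100)            # the small ~100-byte RPC request
got_request = recv_exact(server, 100)  # arrives without waiting for more body
server.sendall(b"reply")
got_reply = recv_exact(client, 5)

client.close()                         # closing one side ends the tunnel
worker.join(timeout=5)
for s in (front, back, server):
    s.close()

print(len(got_request), got_reply)
```

The design trade-off is the one raised in this thread: once the connection is a raw stream, the proxy can no longer validate or trust the HTTP framing of what passes through it.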
William,

since you're addressing me personally: We moved to a VPN-based solution long ago, so I don't need this functionality anymore. However, there are several comments and votes by other people, and I also occasionally get mails from people asking me how I worked around this problem. So, there still seems to be some demand for this feature out there.

Best regards, Hans
If you can remember whether the 100-byte request carried a 100-byte Content-Length, or one that was definitely larger, that would help. Anyone: if you have the opportunity to sniff the httpd -> backend connection and post what leads to this hang on the near side of the conversation, that would be great.
William,

actually, there were(*) two parallel HTTP requests, one for traffic from Outlook to Exchange and one for traffic from Exchange to Outlook. The "upstream" request had a Content-Length header of about 2 GB. The initial ticket description (and my other comments from 2006) contain the headers of both requests as well as an analysis of why Apache's buffering causes a deadlock on Microsoft's - well - creative RPC over HTTP implementation.

(*) That was Outlook 2003 with Exchange 2003. I never checked for Outlook 2007/2010 and Exchange 2008.

Best regards, Hans
(In reply to comment #13) > The "upstream" request had a Content-Length header of about 2 GB. Oh, actually 1 GB, not 2 GB.
Ok, so that's a 1GB 'open ended' pipe, and it expected synchronicity which it's absolutely not allowed to do. Thanks for clarifying, that's what I thought you were getting at. Will ponder interesting solutions, but in the interim it is HTTP/1.1 abuse.
Still/now 'new'
I don't think it's HTTP 1.1 abuse. I can't find anything in RFC2068 that speaks to this point one way or the other. I don't think it's required for mod_proxy to implement this, but it certainly would not be RFC-violating to do so. Suppose a back-end server sends a response with content-length: 1000 but only sends 100 bytes. In that case mod_proxy transmits the 100 bytes to the requestor and waits for further data. But if the incoming request has content-length: 1000 but only sends 100 bytes, mod_proxy just sits there. Why not open the request to the back-end right away? It would improve performance even on regular GET requests, because you're making use of time that would otherwise be wasted on network latency to get the back-end connection open, which means you'll be able to generate a response that much faster. If you have all the HTTP headers and part of the request data, what's the benefit in *not* starting the connection to the back-end, since you know you're going to need it?
Also, regarding vulnerability CVE-2005-2088, surely this can be solved by improving the header parsing rather than by destroying useful functionality that people were using.
I understand the ethical reasons for wanting to implement this, but it would be nice to have perhaps some override options we could specify in a per <Directory> context. I was initially thinking of pre-mangling the Content-Length header, but I doubt this would consistently provide the desired result. Apache does have other "workarounds" for goofy non-compliant stuff...
(In reply to comment #19)
> I understand the ethical reasons for wanting to implement this, but it would be

This should be "for *not* wanting". :)
>> Why not open the request to the back-end right away?
>> It would improve performance even on regular GET requests

In fact, it does nothing of the kind; it increases the contention for the backend servers. The current behavior is correct for taking the stress off the much more computationally intensive backend applications.

The HTTP/1.1 Content-Length description is prescriptive of the server's behavior, and this incorrectly implemented protocol could have *trivially* used the semantically sensible chunked encoding methodology.

The one and only hack around HTTP/1.1 non-compliance is to open a connection-oriented stream and distrust the entire communications stream.
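For reference, this is what chunked framing looks like on the wire. The Python sketch below encodes a sequence of body pieces using the HTTP/1.1 chunked transfer coding (hexadecimal size, CRLF, data, CRLF, terminated by a zero-size chunk). The function name `encode_chunked` and the sample payloads are made up for illustration; trailers and chunk extensions are left out.

```python
def encode_chunked(pieces):
    """Yield each non-empty piece as one HTTP/1.1 chunk, then the terminator.

    Each chunk announces its own length, so the sender never has to declare
    a total body size up front.
    """
    for piece in pieces:
        if piece:  # a zero-length chunk would end the body prematurely
            yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker followed by an empty trailer section


wire = b"".join(encode_chunked([b"hello", b"", b"rpc!"]))
print(wire)
```

With this framing, a sender that does not know how much data it will eventually transmit has no need for a placeholder value such as a 1 GB Content-Length.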
The question was raised: "Microsoft released a spec." In this case, only the IETF defines HTTP. If it complies with HTTP, then anyone is free to build upon it. See the DAV spec for one example. If it fails to comply with HTTP, it isn't HTTP, and the ASF HTTP Server project is unlikely to pay attention; *particularly* if it masquerades as HTTP and is not.
Final note of the day; I've broached the question on the ietf-http-wg list for pointers to any MS bug or KB references to this misimplementation, and pointers to where users can raise the issue. I have yet to hear back, but when I do I'll update this report. In the interim, after lengthy consideration, this is not an httpd proxy flaw.
During my work at Astaro, I wrote an Apache module mod_proxy_msrpc that intends to work around the mentioned limitations of Apache httpd by switching to a transparent tunnel mode (similar to mod_proxy_connect) as soon as the RPC connection has been successfully negotiated between client and server. It is available on GitHub here: https://github.com/bombadil/mod_proxy_msrpc