Bug 61222 - Apache consumes too much memory for CGI output.
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core
Version: 2.4.26
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: FixedInTrunk, PatchAvailable
Depends on:
Blocks:
 
Reported: 2017-06-26 03:19 UTC by Tetsuo Handa
Modified: 2017-10-10 17:59 UTC
CC List: 1 user



Attachments
rip out buffering in content length filter (2.73 KB, patch)
2017-09-11 17:21 UTC, Joe Orton
Send all read data down the chain (2.83 KB, patch)
2017-09-12 07:21 UTC, Ruediger Pluem
Modified rpluem's patch with more FLUSH (3.37 KB, patch)
2017-09-12 10:29 UTC, Joe Orton
iteration #4 (4.12 KB, patch)
2017-09-12 12:25 UTC, Joe Orton
iteration #5 (5.04 KB, patch)
2017-09-12 15:49 UTC, Joe Orton

Description Tetsuo Handa 2017-06-26 03:19:42 UTC
When a CGI program generates a huge amount of output (on the order of many gigabytes),
the system becomes unresponsive due to memory consumption by the Apache worker process
which executed that CGI program. The Apache worker process should not consume too
much memory trying to spool the output from the CGI program. Even if spooling is
necessary for some reason, it should not consume too much memory and it should
free the memory after the current request completes.

Original report (including steps to reproduce) is at https://bugzilla.redhat.com/show_bug.cgi?id=1464406 .

I am reporting it here because I confirmed that this problem remains with httpd-2.4.26.tar.bz2.
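
For illustration, a CGI program of roughly the following shape (a hypothetical stand-in, not the exact reproducer from the Red Hat report) is enough to trigger the problem: it streams many gigabytes to stdout without emitting a Content-Length header.

#include <stdio.h>
#include <string.h>

/* Hypothetical reproducer: stream ~16 GiB of response body without a
   Content-Length header, so httpd has to decide how much to buffer. */
int main(void)
{
    static char chunk[8192];
    long long i, iterations = (16LL << 30) / (long long)sizeof(chunk);

    memset(chunk, 'x', sizeof(chunk));
    printf("Content-Type: text/plain\r\n\r\n");
    for (i = 0; i < iterations; i++) {
        if (fwrite(chunk, 1, sizeof(chunk), stdout) != sizeof(chunk))
            return 1;
    }
    return 0;
}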
Comment 1 Tetsuo Handa 2017-09-08 03:39:01 UTC
Ping? This bug can easily trigger an out-of-memory denial of service.
Comment 2 Joe Orton 2017-09-11 17:06:41 UTC
I can't really understand how this has regressed, since we fixed this bug several times in the 2.0/2.2 era!

ap_content_length_filter seems completely wrong here, and does not try to limit the number of buckets it morphs into HEAP where e->length == -1.  With a fast enough CGI script where the CGI buckets can morph into HEAP as fast as you read them (i.e. you never hit EAGAIN), it'll try to consume as much RAM as it can.  I'm not sure how this ever worked.
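
Roughly speaking, the shape of the problem is this (a simplified sketch with illustrative names, not the actual ap_content_length_filter code): every bucket of unknown length gets read, each read materializes another HEAP bucket in the brigade, and nothing is passed downstream until EOS.

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

/* Simplified sketch of the old behaviour; the function name and
 * structure are illustrative only. */
static apr_status_t buffer_whole_response(ap_filter_t *f, apr_bucket_brigade *b)
{
    apr_bucket *e;

    for (e = APR_BRIGADE_FIRST(b);
         e != APR_BRIGADE_SENTINEL(b);
         e = APR_BUCKET_NEXT(e)) {
        if (e->length == (apr_size_t)-1) {
            const char *data;
            apr_size_t len;
            apr_status_t rv;

            /* Reading a morphing (CGI/pipe) bucket turns it into a HEAP
             * bucket holding the data just read and appends a fresh
             * morphing bucket after it; with a producer that never
             * returns EAGAIN this loop just keeps growing the brigade. */
            rv = apr_bucket_read(e, &data, &len, APR_BLOCK_READ);
            if (rv != APR_SUCCESS) {
                return rv;
            }
        }
    }

    /* Only now, with the entire response held in memory, is anything
     * passed downstream. */
    return ap_pass_brigade(f->next, b);
}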
Comment 3 Joe Orton 2017-09-11 17:21:22 UTC
Created attachment 35316 [details]
rip out buffering in content length filter

I feel like this code should work harder to justify its existence, or else we should rip it out.

If the server should buffer some output specifically to try to compute a C-L and avoid chunked output for streamy content generators (e.g. CGI), OK, there's surely an argument for doing that.  But clearly you need to think hard about limits, which was never done here and somehow we got away with it.

Trying to push buffering into arbitrary filters is annoying and probably wrong unless we are carefully looking at performance trade-offs.

Anyway, the alternative is to rip out the buffering here, which fixes the bug.
Comment 4 Ruediger Pluem 2017-09-12 07:20:55 UTC
(In reply to Joe Orton from comment #3)
> Created attachment 35316 [details]
> rip out buffering in content length filter
> 
> I feel like this code should work harder to justify its existence, or else
> we should rip it out.
> 
> If the server should buffer some output specifically to try to compute a C-L
> and avoid chunked output for streamy content generators (e.g. CGI), OK,
> there's surely an argument for doing that.  But clearly you need to think
> hard about limits, which was never done here and somehow we got away with it.
> 
> Trying to push buffering into arbitrary filters is annoying and probably
> wrong unless we are carefully looking at performance trade-offs.
> 
> Anyway, the alternative is to rip out the buffering here, which fixes the bug.

I understand that you are upset, and I am quite surprised at how old this code is and that we never got hit by it. Just one comment on the patch: it breaks the stats of r->bytes_sent.
How about passing everything down the chain once we have done an apr_bucket_read, no matter whether it returned APR_SUCCESS or EAGAIN?
Comment 5 Ruediger Pluem 2017-09-12 07:21:58 UTC
Created attachment 35317 [details]
Send all read data down the chain

Like the following?
Comment 6 Joe Orton 2017-09-12 10:29:17 UTC
Created attachment 35321 [details]
Modified rpluem's patch with more FLUSH

Thanks a lot for looking at this Ruediger!  You're correct of course on r->bytes_sent, which I completely missed...

So I agree it makes sense to keep the loop, but I think we should keep the FLUSH bucket insertion for the EAGAIN case - or is there a good reason to remove it?  Hence I'd modify your patch slightly to the attached.
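
As a rough sketch of that logic (simplified names, not the attached patch): on EAGAIN, split off everything read so far, append a FLUSH bucket so the downstream filters actually write it out, pass it down, and only then go back to a blocking read.

/* Fragment from inside the per-bucket loop (sketch only; ctx->tmpbb is
 * the temporary brigade used to set aside buckets we haven't read). */
rv = apr_bucket_read(e, &data, &len, APR_NONBLOCK_READ);
if (APR_STATUS_IS_EAGAIN(rv)) {
    apr_bucket *flush = apr_bucket_flush_create(f->c->bucket_alloc);

    /* Set aside the unread tail (the morphing bucket onwards)... */
    ctx->tmpbb = apr_brigade_split_ex(b, e, ctx->tmpbb);

    /* ...and send everything read so far, followed by a FLUSH so the
     * core/ssl output filters push it to the network before we block. */
    APR_BRIGADE_INSERT_TAIL(b, flush);
    rv = ap_pass_brigade(f->next, b);
    if (rv != APR_SUCCESS) {
        return rv;
    }
    apr_brigade_cleanup(b);

    /* Bring the remainder back and retry with a blocking read. */
    APR_BRIGADE_CONCAT(b, ctx->tmpbb);
    rv = apr_bucket_read(e, &data, &len, APR_BLOCK_READ);
}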
Comment 7 Joe Orton 2017-09-12 10:31:36 UTC
I also added the APR_ECONNABORTED logic here, which is orthogonal to this change and so could be committed separately.

+                else if (f->c->aborted) {
+                    return APR_ECONNABORTED;
+                }
Comment 8 Yann Ylavic 2017-09-12 11:15:16 UTC
(In reply to Ruediger Pluem from comment #5)
> Created attachment 35317 [details]
> Send all read data down the chain
> 
> Like the following?

Looks like we wouldn't pass the "pipe" bucket itself down the stack (e.g. e == APR_BRIGADE_FIRST(b) and rv == EAGAIN), and would then keep reading it until EOS in ap_content_length_filter(), or am I missing something?
Do we really want C-L that much nowadays?

If we pass the "pipe" down the stack, I think the ssl or core output filter would do the right thing already, but we need to make sure that ap_save_brigade() isn't called either, or the unbounded heap buckets would still be there...
Comment 9 Yann Ylavic 2017-09-12 11:30:43 UTC
(In reply to Yann Ylavic from comment #8)
> Do we really want C-L that much nowadays?

Scratch this, the C-L is not the point whenever we pass anything before EOS.
Still looks like we duplicate some logic belonging in ssl or core filters here, is it worth it?
Comment 10 Ruediger Pluem 2017-09-12 11:40:50 UTC
(In reply to Yann Ylavic from comment #9)
> (In reply to Yann Ylavic from comment #8)
> > Do we really want C-L that much nowadays?
> 
> Scratch this, the C-L is not the point whenever we pass anything before EOS.
> Still looks like we duplicate some logic belonging in ssl or core filters
> here, is it worth it?

Agreed. We should find a better way / place to count the bytes stats, but for now this fixes an actual issue.
Comment 11 Yann Ylavic 2017-09-12 11:48:07 UTC
Sure, I'm fine with the fix, let's work on this afterwards ;)
Comment 12 Ruediger Pluem 2017-09-12 11:53:04 UTC
(In reply to Joe Orton from comment #6)
> Created attachment 35321 [details]
> Modified rpluem's patch with more FLUSH
> 
> Thanks a lot for looking at this Ruediger!  You're correct of course on
> r->bytes_sent, which I completely missed...
> 
> So I agree it makes sense to keep the loop, but I think we should keep the
> FLUSH bucket insertion for the EAGAIN case - or is there a good reason to
> remove it?  Hence I'd modify your patch slightly to the attached.

My initial thought on the flush was to remove it in order to give the downstream filters some freedom over whether they need to flush or can buffer (at least parts of the data) for whatever optimization, but now I agree that it makes sense to flush when we will probably block next.

But looking at both my own and your patch, additional points come up:

1. Don't we need to handle FLUSH buckets and flush regardless of whatever else happens?
2. The if (e->length == (apr_size_t)-1) check seems stupid, as it would also try to read metadata buckets. Shouldn't we just skip them in the loop (with the exception of the FLUSH buckets above)?
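
Something along these lines, perhaps (a sketch of both points only, inside the per-bucket loop, not taken from either patch):

/* Fragment from inside the per-bucket loop (sketch only): never
 * apr_bucket_read() a metadata bucket, but honour FLUSH by passing
 * what we already have downstream. */
apr_bucket *next = APR_BUCKET_NEXT(e);

if (APR_BUCKET_IS_METADATA(e)) {
    if (APR_BUCKET_IS_FLUSH(e)) {
        /* Pass everything up to and including the FLUSH bucket down the
         * chain, keeping the rest aside in the temporary brigade. */
        ctx->tmpbb = apr_brigade_split_ex(b, next, ctx->tmpbb);
        rv = ap_pass_brigade(f->next, b);
        if (rv != APR_SUCCESS) {
            return rv;
        }
        apr_brigade_cleanup(b);
        APR_BRIGADE_CONCAT(b, ctx->tmpbb);
    }
    e = next;
    continue;
}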
Comment 13 Joe Orton 2017-09-12 12:25:55 UTC
Created attachment 35322 [details]
iteration #4

I think any complete content filter ends up with 99% the same code +/- ten different bugs. :(

Metadata buckets *should* always be length=0 I think, though I'm not sure that's an API contract anywhere.  Anyway it's easy to side-step that.

You're definitely right we should handle FLUSH.  Another iteration here...
Comment 14 Joe Orton 2017-09-12 12:30:38 UTC
Possible micro-optimisation here: if we get a brigade like:

[CGI FLUSH EOR]

and that CGI bucket transforms entirely into a HEAP on first read (i.e. a small fast CGI script) you then get

[HEAP FLUSH EOR]

and we should be able to do that with a single ap_pass_brigade() call, and still compute a C-L.
Comment 15 Joe Orton 2017-09-12 14:22:37 UTC
Actually, scratch the previous comment; that works already.

ap_scan_script_header* will have read from the CGI bucket already before we reach the C-L filter.  So if the output from the CGI script fits into 8K on first entry to the C-L filter we get a brigade like:

[HEAP(response data) CGI EOS]

The CGI bucket morphs to a zero-length IMMORTAL when you read EOF from it, after passing through the loop twice the brigade is:

[HEAP(response data) IMMORTAL("") EOS]

...and we fall through to the final ap_pass_brigade after fixing C-L.
Comment 16 Joe Orton 2017-09-12 14:43:18 UTC
That's only true because I re-introduced the original bug in iteration #4 :(  Iteration #5 coming up...
Comment 17 Joe Orton 2017-09-12 15:49:54 UTC
Created attachment 35323 [details]
iteration #5

I rewrote the loop logic to be more explicit (I hope) in the handling of different bucket types; at least I can understand it properly now.

Three specific cases I'm testing:

1) a CGI script which writes the header, sleeps, writes the body -- must see headers hit the network before waiting to read the body

2) the bug reported here -- must see no buffering

3) for a small fast CGI script with <8K output -- does get C-L on response

... plus running the test suite.

I'm worried there are specific response brigades with FLUSH for which this is going to fail to produce a C-L where we previously did, e.g. as simple as:

HEAP FLUSH EOS

will produce chunked output now.

I think we could safely handle this by moving the "short-cut" case for EOS to right before the apr_brigade_split_ex call?
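
Concretely, something like this before the split (a sketch only; ctx->data_sent stands in for "have we already passed anything downstream", and apr_brigade_length() with read_all == 0 is assumed to report -1 when a length is still unknown):

/* Checking the short-cut before splitting means e.g. [HEAP FLUSH EOS]
 * still gets a C-L instead of being chunked. */
if (!ctx->data_sent && APR_BUCKET_IS_EOS(APR_BRIGADE_LAST(b))) {
    apr_off_t total;

    /* With read_all == 0 this does not read morphing buckets; it is
     * assumed to report -1 if any bucket length is still unknown. */
    apr_brigade_length(b, 0, &total);
    if (total >= 0) {
        ap_set_content_length(r, total);
        return ap_pass_brigade(f->next, b);
    }
}
ctx->tmpbb = apr_brigade_split_ex(b, e, ctx->tmpbb);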
Comment 18 Ruediger Pluem 2017-09-13 06:10:59 UTC
(In reply to Joe Orton from comment #17)
> Created attachment 35323 [details]
> iteration #5
> 
> I rewrote the loop logic to be more explicit (I hope) in the handling of
> different bucket types, at least I can understand it properly now.
> 
> Three specific cases I'm testing:
> 
> 1) a CGI script which writes the header, sleeps, writes the body -- must see
> headers hit the network before waiting to read the body
> 
> 2) the bug reported here -- must see no buffering
> 
> 3) for a small fast CGI script with <8K output -- does get C-L on response
> 
> ... plus running the test suite.
> 
> I'm worried there are specific response brigades with FLUSH for which this is
> going to fail to produce a C-L where we previously did, e.g. as simple as:
> 
> HEAP FLUSH EOS
> 
> will produce chunked output now.
> 
> I think we could safely handle this by moving the "short-cut" case for EOS
> to right before the apr_brigade_split_ex call?

Agreed. Cool patch.

Nitpick:
I think we should call apr_brigade_cleanup(ctx->tmpbb) before leaving if the ap_pass_brigade fails in the loop ( (rv != APR_SUCCESS) /  (f->c->aborted)). This should free up memory faster.
Comment 19 Joe Orton 2017-09-13 11:04:59 UTC
I think the patch which removed 40 lines of code was cooler :(

The brigade cleanup is done for r->pool anyway; I dislike adding explicit cleanups when we can let the pools do their job!

I committed this in r1808230.

There are definitely cases we could further optimise here: e.g. where the entire brigade length is determinable, we could find a C-L without an intermediate ap_pass_brigade.  [HEAP FLUSH HEAP EOS] is an example where we'll now chunk when we could determine a C-L.

I don't know if it's worth the effort to optimise that; arguably content generators which produce excessive FLUSH buckets should be treated as streamy producers.
Comment 20 Ruediger Pluem 2017-09-13 11:21:06 UTC
(In reply to Joe Orton from comment #19)
> I think the patch which removed 40 lines of code was cooler :(
> 
> The brigade cleanup is done for r->pool anyway, I dislike adding explicit
> cleanups when we can let the pools do their job!

Fair enough.

> 
> I committed this in r1808230.
> 
> There are definitely cases we could further optimise here.  e.g. where the
> entire brigade length is determinable, we could find C-L without an
> intermediate ap_pass_brigade.  [HEAP FLUSH HEAP EOS] is an example where
> we'll now chunk when we could determine C-L.
> 
> I don't know if it's worth the effort to optimise that; arguably content
> generators which produce excessive FLUSH buckets should be treated as
> streamy producers.

I don't think that it is worth the effort. These generators should be considered streamy producers, as you say. I also doubt that we see [HEAP FLUSH HEAP EOS] in practice. More likely we see [HEAP FLUSH], which has resulted in chunked encoding so far as well.
Comment 21 Ruediger Pluem 2017-09-13 11:22:13 UTC
(In reply to Tetsuo Handa from comment #1)
> Ping? This bug can easily trigger an out-of-memory denial of service.

Can you please try the latest patch and report back?
Comment 22 Ruediger Pluem 2017-09-13 11:23:34 UTC
(In reply to Ruediger Pluem from comment #21)
> (In reply to Tetsuo Handa from comment #1)
> > Ping? This bug can easily trigger an out-of-memory denial of service.
> 
> Can you please try the latest patch and report back?

I mean the one from r1808230.
Comment 23 Tetsuo Handa 2017-09-14 02:29:34 UTC
(In reply to Ruediger Pluem from comment #22)
> (In reply to Ruediger Pluem from comment #21)
> > (In reply to Tetsuo Handa from comment #1)
> > > Ping? This bug can easily trigger an out-of-memory denial of service.
> > 
> > Can you please try the latest patch and report back?
> 
> I mean the one from r1808230.

The patch solves this bug. Thank you.


My client-side program, which calls the CGI programs, depends on the Content-Length:
field being available in order to show a progress bar, and I was worried about the

> but this DOES change behaviour and some responses will end up
> chunked rather than C-L computed.

part. But as far as I have tested, it seems to me that the patch does not change
behavior, and responses will not end up chunked as long as the CGI programs
explicitly emit a Content-Length: field. Thus, I'm OK with the patch.
Comment 24 Joe Orton 2017-09-14 07:00:56 UTC
(In reply to Tetsuo Handa from comment #23)
> But as far as I have tested, it seems to me that the patch does not change
> behavior, and responses will not end up chunked as long as the CGI programs
> explicitly emit a Content-Length: field. Thus, I'm OK with the patch.

Yes, if a C-L is set explicitly it will always get passed through to the client; this patch won't make any difference there.

Thanks a lot for testing it out!
Comment 25 Joe Orton 2017-10-10 17:59:34 UTC
Merged to 2.4.x -- r1811746