Bug 45753 - Code contribution: BurstFilter for extras
Summary: Code contribution: BurstFilter for extras
Status: NEW
Alias: None
Product: Log4j - Now in Jira
Classification: Unclassified
Component: Companions
Version: 1.2
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: log4j-dev
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-06 10:09 UTC by Chad LaVigne
Modified: 2012-05-26 13:19 UTC
CC: 1 user



Attachments
Logging filter to regulate log message bursts (3.65 KB, application/zip)
2008-09-06 10:09 UTC, Chad LaVigne
Details
Update to BurstFilter contribution (4.80 KB, application/zip)
2009-03-23 14:56 UTC, Chad LaVigne
Details

Description Chad LaVigne 2008-09-06 10:09:21 UTC
Created attachment 22533 [details]
Logging filter to regulate log message bursts

I've written a filter to regulate logging traffic bursts that I'd like to have considered for contribution.  It's useful for situations where you may want to limit the number of logging messages that are sent to an appender, i.e. when the database goes down you don't want 2 million e-mail logs taking out the mail server.  It looks like no further development is planned for 1.2 except for bug fixes and minor enhancements.  Is this feature small enough to be considered a minor enhancement or should I submit the filter for 2.0?

Thanks,
Chad
Comment 1 Curt Arnold 2008-10-09 17:09:14 UTC
First, some picky things:

The license headers in the source code should conform to http://www.apache.org/legal/src-headers.html.  Could you fix the headers and resubmit?

@author tags are now frowned upon at the ASF since they may encourage territoriality.  We haven't removed them from log4j, but we are avoiding adding new ones.

Many methods are missing javadoc comments.

---

I guess the current AsyncAppender with blocking=false would be classified as a leaky bucket where the leak rate is the sustainable rate of the underlying appender.  It does have the advantage of summarizing the discarded messages and reporting the highest error level that was not reported.  What are your thoughts on adding a configurable throttling strategy to AsyncAppender?

The burst filter appears to discard events with no potential for summarizing or being informed of event loss.

It might be desirable to make this level-aware, so you could apply it to DEBUG messages while INFO and higher pass through unaffected.

I'd likely collapse BurstFilter and TokenBucket and rename it TokenBucketFilter.

TokenBucket.getToken() has an unexpected return value.  You'd expect it to return true if there were available tokens, but it returns true if the bucket is empty.
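[Editor's note: Curt's points about naming and getToken() semantics can be illustrated with a minimal token-bucket sketch. The names and structure below are hypothetical, not the code in the attachment; here getToken() returns true when a token is available, as a reader would expect.]

```java
import java.util.concurrent.TimeUnit;

// Minimal token-bucket sketch (hypothetical names; not the attached code).
public class TokenBucketFilter {
    private final long maxTokens;         // burst capacity of the bucket
    private final long fillIntervalNanos; // how often tokens are replenished
    private final long fillAmount;        // tokens added per interval
    private long tokens;
    private long lastFillNanos;

    public TokenBucketFilter(long maxTokens, long fillIntervalSeconds, long fillAmount) {
        this.maxTokens = maxTokens;
        this.fillIntervalNanos = TimeUnit.SECONDS.toNanos(fillIntervalSeconds);
        this.fillAmount = fillAmount;
        this.tokens = maxTokens;          // start full so an initial burst passes
        this.lastFillNanos = System.nanoTime();
    }

    // Consume one token if available; true means "log this event".
    public synchronized boolean getToken() {
        long now = System.nanoTime();
        long intervals = (now - lastFillNanos) / fillIntervalNanos;
        if (intervals > 0) {
            // Replenish, but never exceed the burst capacity.
            tokens = Math.min(maxTokens, tokens + intervals * fillAmount);
            lastFillNanos += intervals * fillIntervalNanos;
        }
        if (tokens > 0) {
            tokens--;
            return true;
        }
        return false;
    }
}
```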
Comment 2 Chad LaVigne 2009-03-23 14:56:31 UTC
Created attachment 23401 [details]
Update to BurstFilter contribution

The attached code addresses the following issues mentioned by Curt.  

1) License headers are fixed
2) @author tags removed
3) JavaDoc comments added to all methods
4) Filter is now level aware
5) BurstFilter & TokenBucket have been condensed into one class
6) TokenBucket.getToken() now returns false if empty 

I didn't rename the filter because I actually submitted this code about 2 years ago under the name TokenBucketFilter, and it was requested at that time that I rename it to something like BurstFilter so that it would be implementation agnostic.  Let me know what the consensus is; I can certainly rename it if need be.
Comment 3 Chad LaVigne 2009-04-01 11:42:05 UTC
Is there any feedback on this contribution?

Comment 4 Chad LaVigne 2011-09-30 18:40:29 UTC
I opened this issue quite a while ago and never received feedback after making the changes initially requested. I've found this filter useful on quite a few projects and each time I add it to a new one I find myself again wishing it was just part of the main log4j distribution.  Is there any chance that this filter will get added to the project or is it something the Log4J team just isn't interested in?
Comment 5 Ralph Goers 2011-10-02 02:33:15 UTC
I tried adding this to my Log4j 2.0 branch but am having some problems. It isn't passing the unit test. It seems to me that when the filter is called it does nothing until fillInterval has expired. Since the first unit test runs right away, it successfully writes 110 records instead of 100.

I'm also concerned with the synchronization as it could cause a bottleneck.
Comment 6 Ralph Goers 2011-10-02 07:26:37 UTC
I implemented a BurstFilter in the Log4j 2.0 experimental branch that behaves a bit differently than yours: it only needs the interval and queue size. You can find it at https://svn.apache.org/repos/asf/logging/log4j/branches/BRANCH_2_0_EXPERIMENTAL/rgoers/log4j2-core/src/main/java/org/apache/logging/log4j/core/filter/BurstFilter.java
Comment 7 Chad LaVigne 2011-10-02 17:34:16 UTC
Hi Ralph,

The unit test should work, although the results are a little hard to read. In the first test you should see 100 of 110 info statements print. A second test logs 110 fatal statements, all of which should print because they are FATAL. Finally, after the test waits 18 seconds, you should see 30 more info statements print (the burst recovery rate is 10 log statements every 6 seconds).  So, is the test printing all 110 info statements for you?  Do you see a line like this with "#110" at the end?

INFO  02-Oct-2011 12:23:27  main Logging 110 messages, should only see 100 logs
# 110

The statement number is a little tough to notice, but you shouldn't be seeing that; the last info statement that prints should be #100, followed by 110 fatals, then 30 infos.  Anyway, it would probably be better to have an actual JUnit test with asserts, but I figured we didn't want to add a dependency on JUnit.  Is that true, or would a JUnit test be preferred?

As for the synchronization, I think it's necessary to ensure that we don't exceed maxTokens, right?
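[Editor's note: Chad is right that the token count must be guarded so maxTokens is never exceeded. One way to reduce the contention Ralph worries about is a lock-free counter using compare-and-set; this is a sketch under that assumption, not code from either attachment.]

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: lock-free token consumption via CAS, as an alternative to
// synchronizing the whole filter method (hypothetical; not the attached code).
public class LockFreeTokens {
    private final AtomicLong tokens;

    public LockFreeTokens(long initial) {
        this.tokens = new AtomicLong(initial);
    }

    // Consume one token without a lock; retry on CAS failure.
    public boolean tryAcquire() {
        while (true) {
            long current = tokens.get();
            if (current <= 0) {
                return false; // bucket empty: drop the event
            }
            if (tokens.compareAndSet(current, current - 1)) {
                return true;  // token consumed; cap never exceeded
            }
        }
    }

    // Replenish tokens, clamped to the maximum.
    public void refill(long amount, long max) {
        while (true) {
            long current = tokens.get();
            long next = Math.min(max, current + amount);
            if (tokens.compareAndSet(current, next)) {
                return;
            }
        }
    }
}
```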

Comment 8 Chad LaVigne 2011-10-02 17:42:04 UTC
Thank you for adding this; I thought my contribution might be a lost cause. So it looks like this is basically the same filter, except that you no longer define the fill amount, i.e. you say to add tokens every 10 seconds but not how many to add each time, correct?  I like that simplification, but I do kind of like having the ability to configure the fill rate too. Maybe the fill rate could still be configurable, but with a sensible default?

Comment 9 Ralph Goers 2011-10-02 19:52:01 UTC
If you've looked at the code you will see that it uses a DelayQueue to control how many log entries can be written. From what I could tell, your implementation was essentially a bucket with a fixed window: you remove x tokens every period, making that the maximum that can be logged. With the DelayQueue, items expire after the maxBurst period, opening up slots for that many new log events.

To give an example, assuming I am understanding correctly: with your algorithm, a window size of 100, an interval of 6, and an amount of 10, you would see a maximum of 100 events in the first 6 seconds, then 10 in the next 6, 10 in the next 6, and so on.  With the implementation I've used, you will see 100 events in the first 6 seconds, but what happens after that is a bit variable. Say the records arrive as 10 in the first second, 30 in the second, 25 in the third, 25 in the fourth, 30 in the fifth (20 are ignored), and 25 in the sixth (all of which are ignored); then 10 new items would be allowed in the seventh second, 30 in the eighth, 25 in the ninth, and so on. It would be even more fluid than that, since the granularity of items in the queue is a nanosecond.

So although you don't specify how many can be added after the queue has filled, since the maximum number of items is specified and can never be exceeded, and since the queue is constantly emptying as items expire, I wouldn't think it necessary to specify another number.
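[Editor's note: the DelayQueue approach Ralph describes can be sketched roughly as below. This is a simplified illustration with hypothetical names, not the actual Log4j 2 BurstFilter source: each accepted event occupies a slot that expires after the interval, freeing capacity for a new event.]

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Simplified sketch of a DelayQueue-based burst limiter (not the Log4j 2 source).
public class DelayQueueBurst {
    private static final class Slot implements Delayed {
        private final long expiresAtNanos;
        Slot(long intervalNanos) {
            this.expiresAtNanos = System.nanoTime() + intervalNanos;
        }
        public long getDelay(TimeUnit unit) {
            return unit.convert(expiresAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    private final DelayQueue<Slot> inFlight = new DelayQueue<>();
    private final int maxBurst;
    private final long intervalNanos;

    public DelayQueueBurst(int maxBurst, long intervalSeconds) {
        this.maxBurst = maxBurst;
        this.intervalNanos = TimeUnit.SECONDS.toNanos(intervalSeconds);
    }

    public synchronized boolean accept() {
        // poll() only returns expired slots, so this drains freed capacity.
        while (inFlight.poll() != null) { }
        if (inFlight.size() < maxBurst) {
            inFlight.add(new Slot(intervalNanos));
            return true;  // slot available: log the event
        }
        return false;     // maxBurst events already in the window: drop it
    }
}
```

Because each slot expires individually, capacity frees up at the same uneven pace the events arrived, which matches the "fluid" behavior described above.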

By the way, yes, I am using JUnit for Log4j 2.0. I converted your test to log to a ListAppender instead of the console, and that was when I couldn't get it to pass. It may have been something I did wrong.

I should emphasize that I didn't add this to Log4j 1.2 as I'm solely focused on building 2.0 and I thought this would make a nice addition. My version can't be added to Log4j 1.2 as the DelayQueue requires Java 5.
Comment 10 John Vasileff 2011-10-03 00:57:30 UTC
I wonder if it would be easier for end users if the configuration were made with rate instead of burstInterval:

float rate; // average number of log statements allowed per second
int maxBurst; // maximum burst of log statements

So, if I have this right, internally:

burstIntervalNanos = maxBurst / rate * NANOS_IN_SECONDS

MaxBurst could be made optional with a default of something like Math.max(10, 2*rate).
Comment 11 Ralph Goers 2011-10-03 02:16:01 UTC
I agree and have made the suggested changes. If the max is left off it defaults to 10 times the rate.
Comment 12 John Vasileff 2011-10-03 17:53:02 UTC
I took a look at BurstFilter and have a few thoughts:

1) The calculation of burstInterval loses precision.  It may be better to store burstIntervalNanos, computed with something like:

this.burstIntervalNanos = NANOS_IN_SECONDS * max / rate;

Currently, with rate=16 and maxBurst=100, burstInterval becomes 6L seconds instead of 6.25F seconds, so the effective rate is 100F/6 == 16.667.  A much worse scenario exists when bursting is disabled: with rate=2 and maxBurst=1, the effective rate is infinite (burstInterval = 1/2 == 0).  More generally, burstInterval becomes 0 whenever rate > maxBurst.

2) It may be valuable to define rate as a float to allow fractional rates.  One example is a job processor that logs trace("pending jobs: " + queue.size()) whenever a job is processed.  The deployer may want to monitor load throughout the day, but no more frequently than once every 100s.  This could be accomplished with rate=0.01, max=1.

3) It probably makes sense to restrict max to between 1 and 1k or something to avoid overflow errors and unnecessary memory/processing overhead.  And of course if burstIntervalNanos <= 0, filtering can be disabled.
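[Editor's note: John's precision point in item 1 can be demonstrated with two illustrative helper methods. The class and method names are hypothetical, not the actual BurstFilter code; the two approaches correspond to computing the interval in whole seconds first versus scaling to nanoseconds before dividing.]

```java
import java.util.concurrent.TimeUnit;

// Demonstrates the precision loss John describes (illustrative only).
public class IntervalPrecision {
    static final long NANOS_IN_SECONDS = TimeUnit.SECONDS.toNanos(1);

    // Problematic: integer division truncates to whole seconds before scaling.
    public static long intervalSecondsFirst(long maxBurst, long rate) {
        return (maxBurst / rate) * NANOS_IN_SECONDS;
    }

    // Better: scale to nanoseconds before dividing by a float rate,
    // preserving sub-second precision (approx. 6.25 s for rate=16, maxBurst=100).
    public static long nanosFirst(long maxBurst, float rate) {
        return (long) (NANOS_IN_SECONDS * maxBurst / rate);
    }
}
```

With rate=16 and maxBurst=100, the seconds-first version truncates to exactly 6 s of nanos, while the nanos-first version stays close to the true 6.25 s. With rate=2 and maxBurst=1, the seconds-first version collapses to 0, disabling throttling entirely.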
Comment 13 Ralph Goers 2011-10-03 20:12:36 UTC
John, I encourage you to open a separate Jira issue for Log4J 2 and submit a patch with how you'd like this to work and enhance the unit test as well. Bugzilla really is for Log4j 1.x.
Comment 14 John Vasileff 2011-10-03 22:25:21 UTC
Ralph, I believe the bug fix for #1 is either as shown in comment 12 (nanos first) or comment 10 (use float for rate).  Items #2 and #3 are just ideas.  I'm not sure it makes sense for me to dig further into this since you are already on the 10 yard line, but if you still feel otherwise I'll try to provide a Jira patch when I find some time.

Hopefully the above discussion will help out if BurstFilter (in some form) is added to Log4j 1.x, but if I have further ideas I'll post elsewhere.
Comment 15 Ralph Goers 2011-10-21 13:51:52 UTC
I modified BurstFilter in 2.0 as you suggested.