Log4cxx
  1. Log4cxx
  2. LOGCXX-161

Using RollingFileAppender increases the working set with each rollover

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.10.0
    • Component/s: Appender
    • Labels:
      None

      Description

      There seems to be a memory leak in APR 1.2.7 (see http://issues.apache.org/bugzilla/show_bug.cgi?id=40955) which can be reproduced using RollingFileAppenders:

      I have observed that our application increases its working set in 8k steps. I have tracked it down to happen whenever the log file rolls over:

      log4cxx::rolling::RollingFileAppender::rollover() (Line 153) calls WriterAppender::closeWriter() which allocates a new APR pool. Before this pool is freed, a call
      to allocator_free() comes from a FileOutputStream destructor, followed by the allocator_free() call for the closeWriter pool. Because of the above APR bug, only one of these pools is deallocated...

      Perusing the guide lines for APR pools (http://svn.apache.org/viewvc/apr/apr/trunk/docs/pool-design.html?view=co), I suppose that closeWriter doesn't need its own pool ("Objects should not have their own pools."). When I changed the log4cxx source accordingly, I couldn't reproduce the leak anymore. (UPDATE: that's not true)

      1. log4cxx-161.patch
        2 kB
        Peter Steiner
      2. log4cxx-161-mutex.patch
        2 kB
        Peter Steiner
      3. log4cxx-161-threshold.patch
        2 kB
        Peter Steiner

        Activity

        Hide
        Peter Steiner added a comment -

        Patch that changes WriterAppender::closeWriter to take a pool as parameter instead of allocating its own pool

        Show
        Peter Steiner added a comment - Patch that changes WriterAppender::closeWriter to take a pool as parameter instead of allocating its own pool
        Hide
        Peter Steiner added a comment -

        Allows to allocate the mutex of FileRenameAction in a specified pool instead of using APRInitializer::getRootPool() and does so in FixedWindowRollingPolicy

        Show
        Peter Steiner added a comment - Allows to allocate the mutex of FileRenameAction in a specified pool instead of using APRInitializer::getRootPool() and does so in FixedWindowRollingPolicy
        Hide
        Peter Steiner added a comment -

        log4cxx-161-threshold.patch:

        • Allows to set the threshold of unused pool memory at which the APR allocator starts giving back memory blocks.
        • Sets a default threshold just after initialisation
        Show
        Peter Steiner added a comment - log4cxx-161-threshold.patch: Allows to set the threshold of unused pool memory at which the APR allocator starts giving back memory blocks. Sets a default threshold just after initialisation
        Hide
        Peter Steiner added a comment -

        After more tests I realized that log4cxx-161.patch didn't fix the problem. It's more an optimisation (don't allocate a new pool when you already have one that you can use).

        The most important patch in this issue is IMHO log4cxx-161-mutex.patch:

        There are (were) some minor issues with apr_pools.c (see the apr-dev archive http://mail-archives.apache.org/mod_mbox/apr-dev/200611.mbox/%3c456D4F91.9060806@hugwi.ch%3e and the APR issues http://issues.apache.org/bugzilla/show_bug.cgi?id=41063 and http://issues.apache.org/bugzilla/show_bug.cgi?id=40955), but these have nothing to do with my application increasing its working set.

        I found that the Mutex that FileRenameAction uses is allocated out of the log4cxx root pool. The root pool is only destroyed when the application terminates, so memory allocated in this pool is never really given back.

        As every rollover creates a couple of FileRenameActions (up to rollingPolicy.MaxIndex instances), every rollover in fact increases the working set. And my application is designed to run for days or months and do many rollovers in this time...

        The third patch here (log4cxx-161-threshold.patch) is not strictly necessary. It helps you keeping the working set smaller. Without this patch no pools are ever given back. They are kept in a free-list and reused for the next matching pool creation, but not given back. This patch lets you specify a limit of unused pools kept in the list - if there are more unused pools in the list, they are given back.

        Show
        Peter Steiner added a comment - After more tests I realized that log4cxx-161.patch didn't fix the problem. It's more an optimisation (don't allocate a new pool when you already have one that you can use). The most important patch in this issue is IMHO log4cxx-161-mutex.patch: There are (were) some minor issues with apr_pools.c (see the apr-dev archive http://mail-archives.apache.org/mod_mbox/apr-dev/200611.mbox/%3c456D4F91.9060806@hugwi.ch%3e and the APR issues http://issues.apache.org/bugzilla/show_bug.cgi?id=41063 and http://issues.apache.org/bugzilla/show_bug.cgi?id=40955 ), but these have nothing to do with my application increasing its working set. I found that the Mutex that FileRenameAction uses is allocated out of the log4cxx root pool. The root pool is only destroyed when the application terminates, so memory allocated in this pool is never really given back. As every rollover creates a couple of FileRenameActions (up to rollingPolicy.MaxIndex instances), every rollover in fact increases the working set. And my application is designed to run for days or months and do many rollovers in this time... The third patch here (log4cxx-161-threshold.patch) is not strictly necessary. It helps you keeping the working set smaller. Without this patch no pools are ever given back. They are kept in a free-list and reused for the next matching pool creation, but not given back. This patch lets you specify a limit of unused pools kept in the list - if there are more unused pools in the list, they are given back.
        Hide
        Curt Arnold added a comment -

        Looks like it needs more research to me. If the log4cxx code is okay and it is an APR bug that has been fixed, then it would be best to upgrade to the newer APR version.

        Show
        Curt Arnold added a comment - Looks like it needs more research to me. If the log4cxx code is okay and it is an APR bug that has been fixed, then it would be best to upgrade to the newer APR version.
        Hide
        Peter Steiner added a comment -

        My point is that the log4cxx code for FileRenameAction is not okay. It needs a pool reference, so that its base class Action can use this pool for its mutex (instead of the root pool). This is fixed with log4cxx-161-mutex.patch

        I first thought that the APR issues were responsible for my problems, but after my research I found out that they are unrelated.

        The other two patches are not that important or maybe even unnecessary:

        • log4cxx-161.patch avoids a Pool creation and is an optimisation.
        • The threshold set by log4cxx-161-threshold.patch would be needed only if the pool usage pattern of log4cxx would have peaks. I observed while debugging that pools are never freed. I then realised that in my application this doesn't matter, because the pools that are given back are just reused. The pools that are not given back (like the mutex pool of FileRenameAction) are not affected anyway.
        Show
        Peter Steiner added a comment - My point is that the log4cxx code for FileRenameAction is not okay. It needs a pool reference, so that its base class Action can use this pool for its mutex (instead of the root pool). This is fixed with log4cxx-161-mutex.patch I first thought that the APR issues were responsible for my problems, but after my research I found out that they are unrelated. The other two patches are not that important or maybe even unnecessary: log4cxx-161.patch avoids a Pool creation and is an optimisation. The threshold set by log4cxx-161-threshold.patch would be needed only if the pool usage pattern of log4cxx would have peaks. I observed while debugging that pools are never freed. I then realised that in my application this doesn't matter, because the pools that are given back are just reused. The pools that are not given back (like the mutex pool of FileRenameAction) are not affected anyway.
        Hide
        Curt Arnold added a comment -

        Thanks for clarification. I wasn't particularly comfortable with the patch since it was possible that the Action might have a longer life that the Pool passed on the rollover method. I took a different tack on the problem and hope I have something that will address your concerns. I would appreciate your testing and feedback.

        First, it was definitely a bad thing to create transient mutexes in the root pool. The Mutex default constructor made it very easy to do the wrong thing, so I removed it and then followed what unravelled. For classes containing mutex members where the object lifetime was obviously contained within the lifetime of an method call or the lifetime of an aggregating object, I added a Pool& argument to the constructor. Logger's fell into this category, so now their mutexes are in the hierarchy's pool. If the Pool passed as a parameter isn't longer-lived, bad things will happen.

        For classes containing mutex members where there was no obvious source of a longer-lived pool, a pool member was added to the class. Action fell into this category, so now creating an Action will create a sub-pool of the root pool and the mutex will be created within the sub-pool. The Action destructor will destroy the sub-pool which should reclaim the mutex. I'd expect (but have not tested), that you could create Action's repeatedly and not increase the working set.

        Committed changes in rev 494914.

        Show
        Curt Arnold added a comment - Thanks for clarification. I wasn't particularly comfortable with the patch since it was possible that the Action might have a longer life that the Pool passed on the rollover method. I took a different tack on the problem and hope I have something that will address your concerns. I would appreciate your testing and feedback. First, it was definitely a bad thing to create transient mutexes in the root pool. The Mutex default constructor made it very easy to do the wrong thing, so I removed it and then followed what unravelled. For classes containing mutex members where the object lifetime was obviously contained within the lifetime of an method call or the lifetime of an aggregating object, I added a Pool& argument to the constructor. Logger's fell into this category, so now their mutexes are in the hierarchy's pool. If the Pool passed as a parameter isn't longer-lived, bad things will happen. For classes containing mutex members where there was no obvious source of a longer-lived pool, a pool member was added to the class. Action fell into this category, so now creating an Action will create a sub-pool of the root pool and the mutex will be created within the sub-pool. The Action destructor will destroy the sub-pool which should reclaim the mutex. I'd expect (but have not tested), that you could create Action's repeatedly and not increase the working set. Committed changes in rev 494914.

          People

          • Assignee:
            Curt Arnold
            Reporter:
            Peter Steiner
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development