Log4net
  1. Log4net
  2. LOG4NET-376

Race condition in AbsoluteTimeDateFormatter

    Details

    • Type: Bug Bug
    • Status: Reopened
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.11
    • Fix Version/s: 1.2.13, 1.3.0
    • Component/s: None
    • Labels:
      None

      Description

      AbsoluteTimeDateFormatter's caching of the "to the second" timestamp string is not thread-safe. It is possible for one thread to clear the check (that this timestamp matches the currently cached "to the second" timestamp), but then end up using an incorrect "to the second" timestamp string if another thread has changed it in the meantime.

      In our organization, we see this bug fairly regularly because we have a mix of "real time" loggers that immediately write out log lines and "batching" loggers that defer logging to a background task that runs every second. We therefore regularly see log lines where the timestamp is off by a second or two.

      The following unit tests demonstrates the bug:

      [TestFixture]
      [Explicit]
      public class Log4netTimestampBug
      {
      /// <summary>
      /// This test demonstrates a bug with the log4net default time formatter (Iso8601DateFormatter)
      /// where the logged timestamp can be seconds off from the actual input timestamp
      /// The bug is caused to a race condition in the base class AbsoluteTimeDateFormatter
      /// because this class caches the timestamp string to the second but it is possible for
      /// the timestamp as written by a different thread to "sneak in" and be used by another
      /// thread erroneously (the checking and usage of this string is not done under a lock, only
      /// its modification)
      /// </summary>
      [Test]
      public void Test()
      {
      var now = DateTime.Now;
      var times = Enumerable.Range(1, 1000000).Select(i => now.AddMilliseconds).ToList();

      var sb1 = new StringBuilder();
      var sb2 = new StringBuilder();

      var task1 = Task.Run(() => WriteAllTheTimes(times, new StringWriter(sb1)));
      var task2 = Task.Delay(50).ContinueWith(t => WriteAllTheTimes(times, new StringWriter(sb2)));

      Task.WaitAll(task1, task2);

      var task1Strings = GetTimeStrings(sb1);
      var task2Strings = GetTimeStrings(sb2);

      var diffs = Enumerable.Range(0, times.Count).Where(i => task1Strings[i] != task2Strings[i]).ToList();

      Console.WriteLine("found

      {0}

      instances where the formatted timestamps are not the same", diffs.Count);
      Console.WriteLine();

      var diffToLookAt = diffs.FirstOrDefault(i => i - 10 > 0 && i + 10 < times.Count);
      if (diffToLookAt != 0)
      {
      Console.WriteLine("Example Diff:");
      Console.WriteLine();
      Console.WriteLine("Index Original Timestamp Task 1 Format Task 2 Format");
      for (int i = diffToLookAt - 10; i < diffToLookAt + 10; i++)
      {
      Console.WriteLine("

      {0,-7}

      {1}

      {2}

      {3}

      {4}

      ", i, times[i].ToString("yyyy-MM-dd HH:mm:ss,fff"),
      task1Strings[i], task2Strings[i], i == diffToLookAt ? "**** DIFF HERE ****" : "");
      }
      }

      CollectionAssert.AreEqual(task1Strings, task2Strings);
      }

      private static List<string> GetTimeStrings(StringBuilder sb1)
      {
      return sb1.ToString().Split(new[]

      {'\r', '\n'}

      , StringSplitOptions.RemoveEmptyEntries).ToList();
      }

      private static void WriteAllTheTimes(IEnumerable<DateTime> times,
      TextWriter writer)
      {
      var formatter = new Iso8601DateFormatter();
      foreach (var t in times)

      { formatter.FormatDate(t, writer); writer.WriteLine(); }

      }
      }

        Activity

        Hide
        Stefan Bodewig added a comment -

        brought back the big lock with svn revision 1542664

        I'll keep this ticket open as I'd like to look into different approaches to solve it in 1.3.0.

        Show
        Stefan Bodewig added a comment - brought back the big lock with svn revision 1542664 I'll keep this ticket open as I'd like to look into different approaches to solve it in 1.3.0.
        Hide
        Stefan Bodewig added a comment -

        I'm afraid svn revision 1486883 http://svn.apache.org/viewvc/logging/log4net/trunk/src/log4net/DateFormatter/AbsoluteTimeDateFormatter.cs?r1=1483378&r2=1486883 has reverted the locking that Dominik had introduced. I'll bring them back for 1.2.13 and also verify we don't access the dictionary outside of a lock.

        Show
        Stefan Bodewig added a comment - I'm afraid svn revision 1486883 http://svn.apache.org/viewvc/logging/log4net/trunk/src/log4net/DateFormatter/AbsoluteTimeDateFormatter.cs?r1=1483378&r2=1486883 has reverted the locking that Dominik had introduced. I'll bring them back for 1.2.13 and also verify we don't access the dictionary outside of a lock.
        Hide
        Stuart Lange added a comment - - edited

        Hello again. Sorry for letting this slide for a while.

        We've noticed that with the log4net 1.2.12 service release, this bug has actually gotten considerably worse. We have run into scenarios where the "to the second" component of the timestamp gets "stuck" and never updates again for the duration of the application. Unfortunately, I have been unable to produce a unit test that reliably reproduces the issue (the unit test above sometimes reproduces it), but my best guess is that this is related to the addition of the Hashtable in AbsoluteTimeDateFormatter in LOG4NET-323. The Hashtable is cleared and read from outside the lock, which can lead to potential race conditions since the Hashtable is not thread-safe.

        I have created my own implementation of IDateFormatter that behaves identically to Iso8601DateFormatter, but avoids the bugs noted in this issue. My performance testing shows that it performs nearly identically to Iso8601DateFormatter in typical scenarios, and actually performs substantially better in multi-threaded scenarios. I chose to make the cache state ThreadStatic as that performs marginally better than the other options (static, instance) in my performance testing, but the other options also perform well and behave equivalently. Code is below. If you would like to see my performance testing code, I can send you that as well, but it uses some of my own custom components that would require unpacking to post in copy-paste friendly form.

            public class StandardDateFormatter : IDateFormatter
            {
                // Using ThreadStatic is a micro-optimization. Making it static or instance state also works.
                // ThreadStatic performs marginally better in scenarios where the same instance of the formatter 
                // is being hit from multiple threads that are using different timestamps.  
                // Performance is roughly equivalent in single-threaded scenarios.
                [ThreadStatic]
                private static Tuple<long, string> _lastTicksToTheSecond;
        
                public void FormatDate(DateTime dateToFormat, TextWriter writer)
                {
                    var ticksToTheSecond = dateToFormat.Ticks - dateToFormat.Ticks % TimeSpan.TicksPerSecond;
                    var lastToTheSecond = _lastTicksToTheSecond;
                    string toTheSecondString;
                    if (lastToTheSecond != null && lastToTheSecond.Item1 == ticksToTheSecond)
                    {
                        toTheSecondString = lastToTheSecond.Item2;
                    }
                    else
                    {
                        toTheSecondString = dateToFormat.ToString("yyyy-MM-dd HH:mm:ss");
                        _lastTicksToTheSecond = Tuple.Create(ticksToTheSecond, toTheSecondString);
                    }
                    writer.Write(toTheSecondString);
                    writer.Write(',');
                    var millisecond = dateToFormat.Millisecond;
                    if (millisecond < 100)
                        writer.Write('0');
                    if (millisecond < 10)
                        writer.Write('0');
                    writer.Write(millisecond);
                }
            }
        
        Show
        Stuart Lange added a comment - - edited Hello again. Sorry for letting this slide for a while. We've noticed that with the log4net 1.2.12 service release, this bug has actually gotten considerably worse. We have run into scenarios where the "to the second" component of the timestamp gets "stuck" and never updates again for the duration of the application. Unfortunately, I have been unable to produce a unit test that reliably reproduces the issue (the unit test above sometimes reproduces it), but my best guess is that this is related to the addition of the Hashtable in AbsoluteTimeDateFormatter in LOG4NET-323 . The Hashtable is cleared and read from outside the lock, which can lead to potential race conditions since the Hashtable is not thread-safe. I have created my own implementation of IDateFormatter that behaves identically to Iso8601DateFormatter, but avoids the bugs noted in this issue. My performance testing shows that it performs nearly identically to Iso8601DateFormatter in typical scenarios, and actually performs substantially better in multi-threaded scenarios. I chose to make the cache state ThreadStatic as that performs marginally better than the other options (static, instance) in my performance testing, but the other options also perform well and behave equivalently. Code is below. If you would like to see my performance testing code, I can send you that as well, but it uses some of my own custom components that would require unpacking to post in copy-paste friendly form. public class StandardDateFormatter : IDateFormatter { // Using ThreadStatic is a micro-optimization. Making it static or instance state also works. // ThreadStatic performs marginally better in scenarios where the same instance of the formatter // is being hit from multiple threads that are using different timestamps. // Performance is roughly equivalent in single-threaded scenarios. [ThreadStatic] private static Tuple< long , string> _lastTicksToTheSecond; public void FormatDate(DateTime dateToFormat, TextWriter writer) { var ticksToTheSecond = dateToFormat.Ticks - dateToFormat.Ticks % TimeSpan.TicksPerSecond; var lastToTheSecond = _lastTicksToTheSecond; string toTheSecondString; if (lastToTheSecond != null && lastToTheSecond.Item1 == ticksToTheSecond) { toTheSecondString = lastToTheSecond.Item2; } else { toTheSecondString = dateToFormat.ToString( "yyyy-MM-dd HH:mm:ss" ); _lastTicksToTheSecond = Tuple.Create(ticksToTheSecond, toTheSecondString); } writer.Write(toTheSecondString); writer.Write(','); var millisecond = dateToFormat.Millisecond; if (millisecond < 100) writer.Write('0'); if (millisecond < 10) writer.Write('0'); writer.Write(millisecond); } }
        Hide
        Dominik Psenner added a comment -

        Regarding ThreadStatic: I did not do performance timings, thus I do not know if it would perform better. My believe is that this attribute could tear down performance in heavily cross-threaded environments by practically disabling the cache. Generally I do not think it is wise to have code behaving one way in multi threading and another way in single threading. It makes the code terribly hard to maintain and therefore I would not want to take a foot on that road unless there are good arguments to do so.

        Regarding the drop of static: changing the cache to be instance specific would practically make it ineffective since it would happen in every formatter instance whereas it needs to be done only once every second. So that would have a performance impact too which probably is bigger than a pure lock. The way Stefan fixed LOG4NET-323 seems to be perfectly fine to me.

        It is widely known that thread safety does not come for free. If you believe that the performance impact is not neglectable I would like to encourage you to do some performance timings for these scenarios:

        • "Rev 1483378" in Single Thread
        • "Rev 1380139 + ThreadStatic" in Single Thread
        • "Rev 1483378" in Multi Thread
        • "Rev 1380139 + ThreadStatic" in Multi Thread
        Show
        Dominik Psenner added a comment - Regarding ThreadStatic: I did not do performance timings, thus I do not know if it would perform better. My believe is that this attribute could tear down performance in heavily cross-threaded environments by practically disabling the cache. Generally I do not think it is wise to have code behaving one way in multi threading and another way in single threading. It makes the code terribly hard to maintain and therefore I would not want to take a foot on that road unless there are good arguments to do so. Regarding the drop of static: changing the cache to be instance specific would practically make it ineffective since it would happen in every formatter instance whereas it needs to be done only once every second. So that would have a performance impact too which probably is bigger than a pure lock. The way Stefan fixed LOG4NET-323 seems to be perfectly fine to me. It is widely known that thread safety does not come for free. If you believe that the performance impact is not neglectable I would like to encourage you to do some performance timings for these scenarios: "Rev 1483378" in Single Thread "Rev 1380139 + ThreadStatic" in Single Thread "Rev 1483378" in Multi Thread "Rev 1380139 + ThreadStatic" in Multi Thread
        Hide
        Stuart Lange added a comment -

        I could also be convinced that locking the whole method is okay if all the static state were changed to instance state – that would also have fixed LOG4NET-323, without requiring the dictionary of strings.

        Show
        Stuart Lange added a comment - I could also be convinced that locking the whole method is okay if all the static state were changed to instance state – that would also have fixed LOG4NET-323 , without requiring the dictionary of strings.
        Hide
        Stuart Lange added a comment -

        Thanks for having a look. I have a feeling that people will get concerned about the performance implications of locking the entire method. I share that concern a bit, as this is a static lock object, so you are effectively synchronizing all calls to this method across the entire application. Have you considered removing the locking and instead marking all the mutable static fields ThreadStatic? http://msdn.microsoft.com/en-us/library/system.threadstaticattribute.aspx

        Show
        Stuart Lange added a comment - Thanks for having a look. I have a feeling that people will get concerned about the performance implications of locking the entire method. I share that concern a bit, as this is a static lock object, so you are effectively synchronizing all calls to this method across the entire application. Have you considered removing the locking and instead marking all the mutable static fields ThreadStatic? http://msdn.microsoft.com/en-us/library/system.threadstaticattribute.aspx
        Hide
        Dominik Psenner added a comment -

        I've commited a second fix as 1483378. Please look at both revisions as being a single patch. Sorry for the noise.

        Show
        Dominik Psenner added a comment - I've commited a second fix as 1483378. Please look at both revisions as being a single patch. Sorry for the noise.
        Hide
        Dominik Psenner added a comment -

        I suspected that the lock in there would be sewed too tight. I've commited a fix as of revision 1483375. Can you confirm that it fixes your issue?

        Show
        Dominik Psenner added a comment - I suspected that the lock in there would be sewed too tight. I've commited a fix as of revision 1483375. Can you confirm that it fixes your issue?
        Hide
        Stuart Lange added a comment -

        Switching the static state to instance state would help a lot. The race condition would still exist, but it would be restricted to a single instance. It is much more reasonable to expect that a single instance will only see a monotonically increasing sequence of timestamps than it is to expect the whole application to process a monotonically increasing sequence of timestamps.

        Show
        Stuart Lange added a comment - Switching the static state to instance state would help a lot. The race condition would still exist, but it would be restricted to a single instance. It is much more reasonable to expect that a single instance will only see a monotonically increasing sequence of timestamps than it is to expect the whole application to process a monotonically increasing sequence of timestamps.
        Hide
        Stuart Lange added a comment -

        The bug is in AbsoluteTimeDateFormatter, so it affects its inheritors – Iso8601TimeDateFormat and DateTimeFormatter – as well. SimpleDateFormatter is not affected. The race condition occurs in AbsoluteTimeDateFormatter.FormatDate – the input time, rounded to the nearest second, is compared to the "last time to the second". If they match, the "last time string" is written to the stream. However, these two operations are not done under a lock, so in-between the check and the write, another thread can change the "last time string".

        In typical operations, this isn't a problem. Our issue is that we have some loggers that batch logging events and defer writing to a background thread, and some that write logs in "real time". This means that simultaneously, we have logging events that are "fresh" being written and logging events that are one or two seconds old being written. This set-up causes us to regularly see "wrong" timestamps in our log files due to this race condition. I've tried to roughly replicate this scenario in the unit test.

        Thanks for your help.

        Show
        Stuart Lange added a comment - The bug is in AbsoluteTimeDateFormatter, so it affects its inheritors – Iso8601TimeDateFormat and DateTimeFormatter – as well. SimpleDateFormatter is not affected. The race condition occurs in AbsoluteTimeDateFormatter.FormatDate – the input time, rounded to the nearest second, is compared to the "last time to the second". If they match, the "last time string" is written to the stream. However, these two operations are not done under a lock, so in-between the check and the write, another thread can change the "last time string". In typical operations, this isn't a problem. Our issue is that we have some loggers that batch logging events and defer writing to a background thread, and some that write logs in "real time". This means that simultaneously, we have logging events that are "fresh" being written and logging events that are one or two seconds old being written. This set-up causes us to regularly see "wrong" timestamps in our log files due to this race condition. I've tried to roughly replicate this scenario in the unit test. Thanks for your help.
        Hide
        Dominik Psenner added a comment -

        Is this issue reproducable with one or more of the following formatters, too?

        • AbsoluteTimeDateFormatter
        • SimpleDateFormatter
        • DateTimeDateFormatter
        Show
        Dominik Psenner added a comment - Is this issue reproducable with one or more of the following formatters, too? AbsoluteTimeDateFormatter SimpleDateFormatter DateTimeDateFormatter

          People

          • Assignee:
            Unassigned
            Reporter:
            Stuart Lange
          • Votes:
            2 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:

              Development