Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0-rc1
    • Fix Version/s: 2.0-rc2
    • Component/s: API
    • Labels:
      None

      Description

      Log4J's markers are not as flexible as markers in SLF4J.

      First, SLF4J's markers are mutable. By allowing markers to be mutable, I can change the relationship of markers to each other based upon runtime or business conditions.

      Second, and more importantly I think, is that essentially SLF4J markers have this parent/child relationship, much like Log4J, except that in SLF4J, I can essentially have a marker with multiple parents. For example, I might want this structure:

      • Animal
        • Bird
          • Duck
        • Mammal
          • Bear
          • Dolphin
      • Travels by
        • Water
          • Duck
          • Dolphin
        • Land
          • Duck
          • Bear
        • Air
          • Duck

      Of course, this is a contrived example, but I wanted to describe the relationships. Now, if I wanted to filter based on markers that travel by Water for some appenders, and another appender wants to filter by Mammals, I can't simply use the single marker of Dolphin.

      Either we need to reverse the marker relationship so that it contains its children, much like SLF4J, or we allow markers to have multiple parents, which I prefer because it could make it more succinct to define:

      private static final Marker BY_LAND = MarkerManager.getMarker("BY_LAND");
      private static final Marker BY_WATER = MarkerManager.getMarker("BY_WATER");
      private static final Marker DUCK = MarkerManager.getMarker("DUCK", BY_LAND, BY_WATER);
      

      As for the Marker API, we would either need to change getParent to getParents, or get rid of the getParent method from the API and just rely on the isInstanceOf method to handle checking multiple parents by looking at private member variables (my preference)

      1. log4j2-585-concept.patch
        50 kB
        Bruce Brouwer
      2. CurrentMarkerBenchmark.java
        2 kB
        Bruce Brouwer
      3. ConceptMarkerBenchmark.java
        2 kB
        Bruce Brouwer

        Activity

        Hide
        Ralph Goers added a comment -

        The choice to make Markers immutable and not support multiple parents in Log4j 2 was deliberate on my part.

        Because Markers in SLF4J are mutable they suffer from performance bottlenecks with multiple threads. If you look at https://github.com/qos-ch/slf4j/blob/master/slf4j-api/src/main/java/org/slf4j/helpers/BasicMarker.java you will see that many of the methods are synchronized, in particular the contains() method. This method is called every time you filter on a marker. If you have an exact match it might not be so bad since it will exit very quickly, but if you are comparing against parents every comparison loops while synchronized and calls the contains method recursively.

        I can tell you from personal experience that this exact scenario showed up in one of my production systems as the major performance bottleneck in the application, which is why I decided to implement it this way.

        Show
        Ralph Goers added a comment - The choice to make Markers immutable and not support multiple parents in Log4j 2 was deliberate on my part. Because Markers in SLF4J are mutable they suffer from performance bottlenecks with multiple threads. If you look at https://github.com/qos-ch/slf4j/blob/master/slf4j-api/src/main/java/org/slf4j/helpers/BasicMarker.java you will see that many of the methods are synchronized, in particular the contains() method. This method is called every time you filter on a marker. If you have an exact match it might not be so bad since it will exit very quickly, but if you are comparing against parents every comparison loops while synchronized and calls the contains method recursively. I can tell you from personal experience that this exact scenario showed up in one of my production systems as the major performance bottleneck in the application, which is why I decided to implement it this way.
        Hide
        Gary Gregory added a comment - - edited

        Would it possible for Log4J 2 to work with an alternate implementation of the Marker interface, a MutableMarker? The default impl being the one we have now, immutable.

        Show
        Gary Gregory added a comment - - edited Would it possible for Log4J 2 to work with an alternate implementation of the Marker interface, a MutableMarker? The default impl being the one we have now, immutable.
        Hide
        Bruce Brouwer added a comment -

        About the performance, I suspected what Ralph is saying would be a very real concern. I looked at the way BasicMarker is implemented and there is nothing that requires us to extend from BasicMarker. I was playing around with an implementation that used a ReentrantReadWriteLock. In most cases, the time spend modifying markers is only in the startup of the app. So I agree, slapping a synchronized on the most important methods is a huge performance bottleneck. But with the ReentrantReadWriteLock, the major time of an application would cause no blocking at all.

        I also have some ideas on how we could get the performance of the regular Log4J markers better than they are today at the expense of a little bit more memory.

        Also, if we made the Log4J markers mutable, then we could have the SLF4J implementation simply delegate to the real Log4J markers. Right now, the two are completely separate. If you have code that mixes Log4J markers with SLF4J markers (because of some 3rd party library that uses SLF4J) the hierarchies can be completely different. This is probably not what people would expect to have happen. I think it will also have real issues when used with the MarkerFilter, which would be referencing a real log4j Marker while comparing against an SLF4J marker.

        Another thing that I noticed is that I think the SLF4J implementation is broken. SLF4J intends the pointers to be pointing at the children. Log4J intends the pointers to be to the parent. Because of this, the implementation of MarkerWrapper.isInstanceOf is backwards.

        I would like to fix this immutability issue, but the issue of allowing multiple parents I think more important, and one that is easy to fix. Would I have support of changing these things if I submitted a patch? Especially if I took careful consideration for performance issues?

        Show
        Bruce Brouwer added a comment - About the performance, I suspected what Ralph is saying would be a very real concern. I looked at the way BasicMarker is implemented and there is nothing that requires us to extend from BasicMarker. I was playing around with an implementation that used a ReentrantReadWriteLock. In most cases, the time spend modifying markers is only in the startup of the app. So I agree, slapping a synchronized on the most important methods is a huge performance bottleneck. But with the ReentrantReadWriteLock, the major time of an application would cause no blocking at all. I also have some ideas on how we could get the performance of the regular Log4J markers better than they are today at the expense of a little bit more memory. Also, if we made the Log4J markers mutable, then we could have the SLF4J implementation simply delegate to the real Log4J markers. Right now, the two are completely separate. If you have code that mixes Log4J markers with SLF4J markers (because of some 3rd party library that uses SLF4J) the hierarchies can be completely different. This is probably not what people would expect to have happen. I think it will also have real issues when used with the MarkerFilter, which would be referencing a real log4j Marker while comparing against an SLF4J marker. Another thing that I noticed is that I think the SLF4J implementation is broken. SLF4J intends the pointers to be pointing at the children. Log4J intends the pointers to be to the parent. Because of this, the implementation of MarkerWrapper.isInstanceOf is backwards. I would like to fix this immutability issue, but the issue of allowing multiple parents I think more important, and one that is easy to fix. Would I have support of changing these things if I submitted a patch? Especially if I took careful consideration for performance issues?
        Hide
        Ralph Goers added a comment -

        Yes, feel free to submit a patch. I have no problem making the implementation more like SLF4J if it performs well.

        Show
        Ralph Goers added a comment - Yes, feel free to submit a patch. I have no problem making the implementation more like SLF4J if it performs well.
        Hide
        Ralph Goers added a comment -

        If you are going to use a lock you might consider just using one of the concurrent collections.

        Show
        Ralph Goers added a comment - If you are going to use a lock you might consider just using one of the concurrent collections.
        Hide
        Ralph Goers added a comment -

        I looked at BasicMarker again. It has changed a bit I guess. The contains method isn't synchronized any more but it is now accessing the list in a way that isn't thread safe. It is looping over the reference list without any kind of locking while the add and remove methods are synchronized.

        Interestingly, the iterator method is synchronized but returns an Iterator that isn't thread-safe. I'm not even sure why that method is there.

        Show
        Ralph Goers added a comment - I looked at BasicMarker again. It has changed a bit I guess. The contains method isn't synchronized any more but it is now accessing the list in a way that isn't thread safe. It is looping over the reference list without any kind of locking while the add and remove methods are synchronized. Interestingly, the iterator method is synchronized but returns an Iterator that isn't thread-safe. I'm not even sure why that method is there.
        Hide
        Bruce Brouwer added a comment - - edited

        Cool. So, in my patch, when I make log4j Markers mutable, I will make sure read operations are as performant as possible, but for operations that modify markers, I am not going to spend a lot of effort making those highly performant as I think read operations are far more important.

        Of the items in the concurrent collections, the one that seemed promising to me for this situation was ConcurrentHashMap, but it does not maintain the original order and I don't really need the Map feature, just Set. The only place where I think order might have some impact when allowing multiple parents is in the toString method. It wouldn't keep the order that was specified when creating the marker with parents. Do you think order is important in the toString method? Could I get away with just sorting the parent names for .toString()? If so, then I can seriously consider ConcurrentHashMap. Otherwise I'll have to go with another collection and use the ReentrantReadWriteLock.

        Show
        Bruce Brouwer added a comment - - edited Cool. So, in my patch, when I make log4j Markers mutable, I will make sure read operations are as performant as possible, but for operations that modify markers, I am not going to spend a lot of effort making those highly performant as I think read operations are far more important. Of the items in the concurrent collections, the one that seemed promising to me for this situation was ConcurrentHashMap, but it does not maintain the original order and I don't really need the Map feature, just Set. The only place where I think order might have some impact when allowing multiple parents is in the toString method. It wouldn't keep the order that was specified when creating the marker with parents. Do you think order is important in the toString method? Could I get away with just sorting the parent names for .toString()? If so, then I can seriously consider ConcurrentHashMap. Otherwise I'll have to go with another collection and use the ReentrantReadWriteLock.
        Hide
        Ralph Goers added a comment -

        You could use CopyOnWriteArrayList. If adds and removals are infrequent it works OK. But using a ReentrantReadWriteLock is OK.

        The only problem I see is that I really the issue of the pointers. To me, it makes sense to have the pointers point to their parent and to add markers to the parent and not the other way around.

        Show
        Ralph Goers added a comment - You could use CopyOnWriteArrayList. If adds and removals are infrequent it works OK. But using a ReentrantReadWriteLock is OK. The only problem I see is that I really the issue of the pointers. To me, it makes sense to have the pointers point to their parent and to add markers to the parent and not the other way around.
        Hide
        Gary Gregory added a comment - - edited

        Why can't we have 2 Marker implementations? Mutable and immutable? Just like a Logger can have a message factory, let it have a marker factory. The factory pattern is already there is makes a nice parallel here b/w markers and messages being pluggable in the same way.

        Show
        Gary Gregory added a comment - - edited Why can't we have 2 Marker implementations? Mutable and immutable? Just like a Logger can have a message factory, let it have a marker factory. The factory pattern is already there is makes a nice parallel here b/w markers and messages being pluggable in the same way.
        Hide
        Bruce Brouwer added a comment -

        Interesting. I don't know if it makes sense for each logger to have a marker factory, but in the way we have different LoggerContextFactory implementations, we could have different MarkerFactory implementations. log4j-slf4j-impl could provide an alternate MarkerFactory. I don't know if I like the idea of simply including log4j-slf4j-impl and having it switch the implementation of MarkerFactory without me realizing it. But I suppose that is what happens when I include log4j-core with LoggerContextFactory. It seems different to me, somehow. I would expect log4j-slf4j-impl to be a thin wrapper around the real log4j implementation, including the markers.

        I was actually thinking a bit about the Marker interface. If we don't go with the MarkerFactory idea, what is the point of having Marker be an interface? It should be a concrete final class to prevent anyone or anything making other kinds of Markers. I could see weird things happening if we pass an SLF4J marker with one hierarchy passed into a Log4J marker's .isInstanceOf method. We could run into a situation where slf4jMarker.isInstanceOf(log4jMarker) returns true and also log4jMarker.isInstanceOf(slf4jMarker) returns true, event though they are not themselves equal. This doesn't really make good logical sense. The only impact I think would be that the log4j-slf4j-impl MarkerWrapper couldn't implement Marker, but would rather need to provide a getter.

        I'm willing to put in some time developing this, even at the risk of not using it if it proves to be too big of a performance hit. But if we don't go down this road, we run into the problem of the SLF4J marker hierarchy possibly being inconsistent with the Log4j marker hierarchy and having some of the weird behavior I described.

        Show
        Bruce Brouwer added a comment - Interesting. I don't know if it makes sense for each logger to have a marker factory, but in the way we have different LoggerContextFactory implementations, we could have different MarkerFactory implementations. log4j-slf4j-impl could provide an alternate MarkerFactory. I don't know if I like the idea of simply including log4j-slf4j-impl and having it switch the implementation of MarkerFactory without me realizing it. But I suppose that is what happens when I include log4j-core with LoggerContextFactory. It seems different to me, somehow. I would expect log4j-slf4j-impl to be a thin wrapper around the real log4j implementation, including the markers. I was actually thinking a bit about the Marker interface. If we don't go with the MarkerFactory idea, what is the point of having Marker be an interface? It should be a concrete final class to prevent anyone or anything making other kinds of Markers. I could see weird things happening if we pass an SLF4J marker with one hierarchy passed into a Log4J marker's .isInstanceOf method. We could run into a situation where slf4jMarker.isInstanceOf(log4jMarker) returns true and also log4jMarker.isInstanceOf(slf4jMarker) returns true, event though they are not themselves equal. This doesn't really make good logical sense. The only impact I think would be that the log4j-slf4j-impl MarkerWrapper couldn't implement Marker, but would rather need to provide a getter. I'm willing to put in some time developing this, even at the risk of not using it if it proves to be too big of a performance hit. But if we don't go down this road, we run into the problem of the SLF4J marker hierarchy possibly being inconsistent with the Log4j marker hierarchy and having some of the weird behavior I described.
        Hide
        Bruce Brouwer added a comment - - edited

        Here is a log4j2-585-concept.patch that outlines my ideas for the Markers. It is merely a concept and not intended to be committed yet. I'm sure I've probably gone too far in some areas, but hear me out first.

        For performance, I checked with JMH. So long as the marker hierarchy is no more than 5 deep, the current Log4j code is faster, but after 5 deep, my concept is faster than the current Log4j code. For the normal case of only one parent, the current Log4j code can perform about 600,000 ops/ms, while my concept performs about 200,000 ops/ms. A simpler version of my concept that did not include the allParentNames variable was much slower, clocking in at only about 10,000 ops/ms.

        The big driver behind my concept was to make Log4j Markers just as powerful as slf4j. This means primarily two things: mutability and multiple parents.

        First, I got rid of MarkerManager and turned Marker into a concrete, final class. By making Marker an interface, clients might be inclined to implement the interface themselves and cause issues because the parent hierarchy of their markers would likely not match the parent hierarchy of Log4j's markers. It is because of this that I don't like the idea of pluggable Marker factories.

        Next, my ReadWriteLock is static. I did this to avoid creating a ton of lock objects for each Marker. The majority of time, marker hierarchies are not changing, so a lock that blocks most marker functions during an update is likely not a concern. Furthermore, the most prevalent cases, such as calling isInstanceOf, are not even guarded by a lock and allow ConcurrentHashMap to provide all the performance possible.

        getParents() is now a Set<Marker> instead of a simple Marker. This allows multiple parents. I went with a custom Set implementation to be able to react to changes to the parents set, which is used primarily to keep allParentNames up to date.

        The static methods that used to be on MarkerManager are now on Marker.

        Marker.get works exactly like MarkerManager.getMarker. An overloaded version allows you to indicate if you want a marker created if it does not exist. This was necessary for use in the Slf4j-impl, but I could see it being valuable elsewhere, too.

        Marker.define works a little bit like MarkerManager.getMarker(name, parent), but because Markers are now mutable, it ensures that the parents become the list provided here. I like the name define better because it is clear to me that this will make the marker look the way I just described. I did not like MarkerManager.getMarker(name, parent) because it does not guarantee me that the returned marker even has the specified parent. If the marker was created earlier to this call, it won't have my specified parent.

        Marker.undefine allows markers to be removed, which is a feature of slf4j and it was fairly easy to support here.

        So, is it worth a bit of a performance penalty to get all these features in Markers?

        Show
        Bruce Brouwer added a comment - - edited Here is a log4j2-585-concept.patch that outlines my ideas for the Markers. It is merely a concept and not intended to be committed yet. I'm sure I've probably gone too far in some areas, but hear me out first. For performance, I checked with JMH. So long as the marker hierarchy is no more than 5 deep, the current Log4j code is faster, but after 5 deep, my concept is faster than the current Log4j code. For the normal case of only one parent, the current Log4j code can perform about 600,000 ops/ms, while my concept performs about 200,000 ops/ms. A simpler version of my concept that did not include the allParentNames variable was much slower, clocking in at only about 10,000 ops/ms. The big driver behind my concept was to make Log4j Markers just as powerful as slf4j. This means primarily two things: mutability and multiple parents. First, I got rid of MarkerManager and turned Marker into a concrete, final class. By making Marker an interface, clients might be inclined to implement the interface themselves and cause issues because the parent hierarchy of their markers would likely not match the parent hierarchy of Log4j's markers. It is because of this that I don't like the idea of pluggable Marker factories. Next, my ReadWriteLock is static. I did this to avoid creating a ton of lock objects for each Marker. The majority of time, marker hierarchies are not changing, so a lock that blocks most marker functions during an update is likely not a concern. Furthermore, the most prevalent cases, such as calling isInstanceOf, are not even guarded by a lock and allow ConcurrentHashMap to provide all the performance possible. getParents() is now a Set<Marker> instead of a simple Marker. This allows multiple parents. I went with a custom Set implementation to be able to react to changes to the parents set, which is used primarily to keep allParentNames up to date. The static methods that used to be on MarkerManager are now on Marker. Marker.get works exactly like MarkerManager.getMarker. An overloaded version allows you to indicate if you want a marker created if it does not exist. This was necessary for use in the Slf4j-impl, but I could see it being valuable elsewhere, too. Marker.define works a little bit like MarkerManager.getMarker(name, parent), but because Markers are now mutable, it ensures that the parents become the list provided here. I like the name define better because it is clear to me that this will make the marker look the way I just described. I did not like MarkerManager.getMarker(name, parent) because it does not guarantee me that the returned marker even has the specified parent. If the marker was created earlier to this call, it won't have my specified parent. Marker.undefine allows markers to be removed, which is a feature of slf4j and it was fairly easy to support here. So, is it worth a bit of a performance penalty to get all these features in Markers?
        Hide
        Gary Gregory added a comment -

        I did not look at your patch, but here are a few thoughts.

        If a marker name is immutable but you want to change the hierarchy and this is a rare case, not something I do all the time, would it be possible to rebuild the Marker hierarchy with new immutable markers? A sort of builder pattern but for the whole tree.

        Other wacky idea: Could the hierarchy be based on the name like Java FQCN and Log4j logger names?

        I am not thinking of other framework's definition of a marker here.

        I find it confusing that we have different kinds of hierarchies for Makers and Loggers.

        Show
        Gary Gregory added a comment - I did not look at your patch, but here are a few thoughts. If a marker name is immutable but you want to change the hierarchy and this is a rare case, not something I do all the time, would it be possible to rebuild the Marker hierarchy with new immutable markers? A sort of builder pattern but for the whole tree. Other wacky idea: Could the hierarchy be based on the name like Java FQCN and Log4j logger names? I am not thinking of other framework's definition of a marker here. I find it confusing that we have different kinds of hierarchies for Makers and Loggers.
        Hide
        Bruce Brouwer added a comment - - edited

        The problem with creating all new immutable markers is that there are likely to be references to the old markers all throughout the code that are now invalid. Now, I suppose I have opened up that possibility now by allowing markers to be undefined and then redefined, but I think that is an even more rare case that we don't need to be concerned with.

        The other idea is something that I hadn't considered of before. I'm assuming you mean marker "x.y.z" would have a parent of "x.y", which has a parent of "x". However, this fails to give the ability to have multiple parents

        If we went a completely different route and just had markers be interfaces, I could see code like this:

        public interface MyFirstParentMarker extends Marker { }
        public interface MySecondParentMarker extends Marker { }
        public interface MyChildMarker extends MyFirstParentMarker, MySecondParentMarker { }
        

        This has a few problems, too. First, I have no idea how we could get it to work with slf4j which is based upon marker names being strings, not a Java type hierarchy. Second, how do we reference these markers? The best idea I can come up with here is to change the log4j-api to take a Class<? extends Marker> marker instead of Marker marker. Now there is no need for a MarkerManager as the Java class loader takes care of it for me.

        I like this idea, except that I don't see a way to make it work with slf4j.

        Show
        Bruce Brouwer added a comment - - edited The problem with creating all new immutable markers is that there are likely to be references to the old markers all throughout the code that are now invalid. Now, I suppose I have opened up that possibility now by allowing markers to be undefined and then redefined, but I think that is an even more rare case that we don't need to be concerned with. The other idea is something that I hadn't considered of before. I'm assuming you mean marker "x.y.z" would have a parent of "x.y", which has a parent of "x". However, this fails to give the ability to have multiple parents If we went a completely different route and just had markers be interfaces, I could see code like this: public interface MyFirstParentMarker extends Marker { } public interface MySecondParentMarker extends Marker { } public interface MyChildMarker extends MyFirstParentMarker, MySecondParentMarker { } This has a few problems, too. First, I have no idea how we could get it to work with slf4j which is based upon marker names being strings, not a Java type hierarchy. Second, how do we reference these markers? The best idea I can come up with here is to change the log4j-api to take a Class<? extends Marker> marker instead of Marker marker . Now there is no need for a MarkerManager as the Java class loader takes care of it for me. I like this idea, except that I don't see a way to make it work with slf4j.
        Hide
        Bruce Brouwer added a comment -

        So another idea I had is along the lines of what Gary was proposing. So we have MarkerManager today, which is analogous to LogManager. But instead of MarkerManager holding onto a static map of markers, what if the LoggerContext is what actually kept track of all the markers, like it keeps track of all the Loggers.

        And like LoggerContext now returns LoggerProviders, LoggerContext could return MarkerProviders, while the MarkerManager static methods would return simple Markers. This would allow the public Marker API to be simpler than the Marker SPI. The Marker SPI could have the methods that make markers mutable. The only thing I can think that the API would be missing that the SPI has would be the getParents method.

        With this approach, there could be two LoggerContexts, each with their own marker hierarchy, rather than one set of markers that everything shares.

        Does this approach sound more palatable? I don't think it would be too hard to switch my concept over to this approach.

        Also, are the issues I've identified recognized as real issues? (e.g. slf4j-impl markers having almost no relation to log4j markers) If the rest of the team doesn't agree with me, I'll stop spending time on this. I will be using log4j 2 primarily through the slf4j interface and I personally see this as a real problem that needs to be fixed before 2.0.

        Show
        Bruce Brouwer added a comment - So another idea I had is along the lines of what Gary was proposing. So we have MarkerManager today, which is analogous to LogManager. But instead of MarkerManager holding onto a static map of markers, what if the LoggerContext is what actually kept track of all the markers, like it keeps track of all the Loggers. And like LoggerContext now returns LoggerProviders, LoggerContext could return MarkerProviders, while the MarkerManager static methods would return simple Markers. This would allow the public Marker API to be simpler than the Marker SPI. The Marker SPI could have the methods that make markers mutable. The only thing I can think that the API would be missing that the SPI has would be the getParents method. With this approach, there could be two LoggerContexts, each with their own marker hierarchy, rather than one set of markers that everything shares. Does this approach sound more palatable? I don't think it would be too hard to switch my concept over to this approach. Also, are the issues I've identified recognized as real issues? (e.g. slf4j-impl markers having almost no relation to log4j markers) If the rest of the team doesn't agree with me, I'll stop spending time on this. I will be using log4j 2 primarily through the slf4j interface and I personally see this as a real problem that needs to be fixed before 2.0.
        Hide
        Ralph Goers added a comment - - edited

        You mention that your code is slower so long as the marker depth is less than 5. I would guess that means that about 98% of the time it will be slower and about 50% of the time it will be 3 times slower. I would not be in favor of that.

        I will take a look at this this weekend and see how this can be done.

        I'd also really like to understand a real use case you might have for dynamically changing Markers. Without that requirement it would be easy to just add

        getMarker(String name, String[] parents)
        getMarker(String name, Marker[] parents)
        

        to MarkerManager with what I would imagine to be very little impact.

        Can you add the code you used to test your implementation to this issue if it isn't already part of the patch? I'd like to use that to test whatever I might do.

        Show
        Ralph Goers added a comment - - edited You mention that your code is slower so long as the marker depth is less than 5. I would guess that means that about 98% of the time it will be slower and about 50% of the time it will be 3 times slower. I would not be in favor of that. I will take a look at this this weekend and see how this can be done. I'd also really like to understand a real use case you might have for dynamically changing Markers. Without that requirement it would be easy to just add getMarker( String name, String [] parents) getMarker( String name, Marker[] parents) to MarkerManager with what I would imagine to be very little impact. Can you add the code you used to test your implementation to this issue if it isn't already part of the patch? I'd like to use that to test whatever I might do.
        Hide
        Bruce Brouwer added a comment -

        Here are the benchmark classes I was using. They're pretty simple, just dealing with a single parent for each marker in order to compare the current vs. concept.

        I had an idea to try out myself with volatile references to immutable arrays where the array references are replaced when the parents change. Maybe that will be faster to iterate over than checking if something is in a ConcurrentHashMap.

        Show
        Bruce Brouwer added a comment - Here are the benchmark classes I was using. They're pretty simple, just dealing with a single parent for each marker in order to compare the current vs. concept. I had an idea to try out myself with volatile references to immutable arrays where the array references are replaced when the parents change. Maybe that will be faster to iterate over than checking if something is in a ConcurrentHashMap.
        Hide
        Gary Gregory added a comment -

        If we want

        getMarker(String name, String[] parents)
        getMarker(String name, Marker[] parents)
        

        let make sure it really is:

        getMarker(String name, String... parents)
        getMarker(String name, Marker... parents)
        
        Show
        Gary Gregory added a comment - If we want getMarker( String name, String [] parents) getMarker( String name, Marker[] parents) let make sure it really is: getMarker( String name, String ... parents) getMarker( String name, Marker... parents)
        Hide
        Ralph Goers added a comment -

        Bruce, that is too funny as that is exactly what I implemented this morning.

        I am wondering, if the user makes a call to getMarker(name, parent1, parent2) should parent1 and parent2 be marked as immutable so that only markers that are dynamically added can be removed?

        Show
        Ralph Goers added a comment - Bruce, that is too funny as that is exactly what I implemented this morning. I am wondering, if the user makes a call to getMarker(name, parent1, parent2) should parent1 and parent2 be marked as immutable so that only markers that are dynamically added can be removed?
        Hide
        Gary Gregory added a comment -

        Don't you mean "Gary"? Had coffee yet ?

        Show
        Gary Gregory added a comment - Don't you mean "Gary"? Had coffee yet ?
        Hide
        Ralph Goers added a comment -

        No - I was referring to Bruce's suggestion to use a volatile reference to an array. I implemented that last night.

        The more I think about it the more I like the idea that parents added on the constructor cannot be removed.

        Show
        Ralph Goers added a comment - No - I was referring to Bruce's suggestion to use a volatile reference to an array. I implemented that last night. The more I think about it the more I like the idea that parents added on the constructor cannot be removed.
        Hide
        Bruce Brouwer added a comment -

        Why make that rule that parents added during construction cannot be removed? If we allow any modification, it doesn't make sense to me why we would restrict those modifications. I suppose I'm asking you to convince me why that is better.

        Show
        Bruce Brouwer added a comment - Why make that rule that parents added during construction cannot be removed? If we allow any modification, it doesn't make sense to me why we would restrict those modifications. I suppose I'm asking you to convince me why that is better.
        Hide
        Ralph Goers added a comment -

        Lets say I want to create a parent marker called Audit and then I create a marker called Transfer and I want to insure that that relationship is permanent. By specifying them on the getMarker call I can make that happen. If I later want to add another parent to Transfer I can still do that by using the Marker's add method. This also won't clash with SLF4J since it doesn't support adding markers in MarkerFactory.

        Show
        Ralph Goers added a comment - Lets say I want to create a parent marker called Audit and then I create a marker called Transfer and I want to insure that that relationship is permanent. By specifying them on the getMarker call I can make that happen. If I later want to add another parent to Transfer I can still do that by using the Marker's add method. This also won't clash with SLF4J since it doesn't support adding markers in MarkerFactory.
        Hide
        Bruce Brouwer added a comment -

        So, are you saying that the only benefit is that I am guaranteed that some relationships cannot be modified? That seems like it would add complexity if some relationships can be modified and not others. Why can't I just make sure my client code never modifies those relationships?

        I went ahead and tried the volatile immutable arrays, and it definitely helps. Here's my results:

        Depth Current Concept worst Concept average
        3 340K ops/ms 280K ops/ms (17% slower) 380K ops/ms (12% faster)
        2 680K ops/ms 360K ops/ms (47% slower) 430K ops/ms (36% slower)
        1 1230K ops/ms 500K ops/ms (60% slower) same as worst
        0 2550K ops/ms 2000K ops/ms (21% slower) same as worst

        The difference between "Concept Worst" and "Concept Average" is because it might not take x number of checks for a parent that is x levels away. On average, it would take only x/2 checks since all parent names are merged together for testing purposes. As you can see, the average case is already faster now at 3 levels deep. I don't think there is any way to get the 0 and 1 level deep tests to be faster than what we have currently as the current implementation simply needs to check one or two variables, rather than iterating over an array.

        But in the end, we're talking about hundreds of millions of isInstanceOf invocations per second. This marker test is only one small part of the entire logging performance picture. And it is only going to cause a small (and I think very small) penalty for those who even use the markers.

        The benefit is a more flexible marker hierarchy and full support for slf4j.

        Show
        Bruce Brouwer added a comment - So, are you saying that the only benefit is that I am guaranteed that some relationships cannot be modified? That seems like it would add complexity if some relationships can be modified and not others. Why can't I just make sure my client code never modifies those relationships? I went ahead and tried the volatile immutable arrays, and it definitely helps. Here's my results: Depth Current Concept worst Concept average 3 340K ops/ms 280K ops/ms (17% slower) 380K ops/ms (12% faster) 2 680K ops/ms 360K ops/ms (47% slower) 430K ops/ms (36% slower) 1 1230K ops/ms 500K ops/ms (60% slower) same as worst 0 2550K ops/ms 2000K ops/ms (21% slower) same as worst The difference between "Concept Worst" and "Concept Average" is because it might not take x number of checks for a parent that is x levels away. On average, it would take only x/2 checks since all parent names are merged together for testing purposes. As you can see, the average case is already faster now at 3 levels deep. I don't think there is any way to get the 0 and 1 level deep tests to be faster than what we have currently as the current implementation simply needs to check one or two variables, rather than iterating over an array. But in the end, we're talking about hundreds of millions of isInstanceOf invocations per second. This marker test is only one small part of the entire logging performance picture. And it is only going to cause a small (and I think very small) penalty for those who even use the markers. The benefit is a more flexible marker hierarchy and full support for slf4j.
        Hide
        Ralph Goers added a comment -

        Can you tell me what the command line is that you are using to invoke the benchmarks? Or are you running them with no options?

        Show
        Ralph Goers added a comment - Can you tell me what the command line is that you are using to invoke the benchmarks? Or are you running them with no options?
        Hide
        Bruce Brouwer added a comment -

        java -jar target/microbenchmark.jar -t 8

        So, 8 concurrent threads running on a Windows 7 4-core machine.

        Show
        Bruce Brouwer added a comment - java -jar target/microbenchmark.jar -t 8 So, 8 concurrent threads running on a Windows 7 4-core machine.
        Hide
        Ralph Goers added a comment -

        I spent quite a bit of time yesterday working on this and running the benchmarks against various implementations and I think I finally came up with something that is OK. My results are:

        Depth Current Multiple Parents
        4 516K ops/ms 276K ops/ms (47% slower)
        3 636K ops/ms 381K ops/ms (40% slower)
        2 903K ops/ms 451K ops/ms (50% slower)
        1 1,661K ops/ms 1,356K ops/ms (18% slower)
        0 2,882K ops/ms 2,827K ops/ms (2% slower)

        I'm OK with this primarily since I expect most Markers will only be a parent or a parent and a child.

        One thing interesting to note is that your tests of the current implementation seem to degrade much more significantly than mine. I suspect that is due to the different machines. I am running on what is now a fairly old Macbook Pro. It has a 2.5 GHz Intel Core i7 processor. It has 4 physical cores with 2 logical cores per physical core.

        In thinking about the immutability I have come to the conclusion that I can't do it. If I make the relationship between Child and Parent immutable there is nothing that would prevent the Parent from having a Grandparent added to it, which kind of makes the whole idea pointless.

        I still have to work on the SLF4J wrapper but I hope I can get it committed today.

        Show
        Ralph Goers added a comment - I spent quite a bit of time yesterday working on this and running the benchmarks against various implementations and I think I finally came up with something that is OK. My results are: Depth Current Multiple Parents 4 516K ops/ms 276K ops/ms (47% slower) 3 636K ops/ms 381K ops/ms (40% slower) 2 903K ops/ms 451K ops/ms (50% slower) 1 1,661K ops/ms 1,356K ops/ms (18% slower) 0 2,882K ops/ms 2,827K ops/ms (2% slower) I'm OK with this primarily since I expect most Markers will only be a parent or a parent and a child. One thing interesting to note is that your tests of the current implementation seem to degrade much more significantly than mine. I suspect that is due to the different machines. I am running on what is now a fairly old Macbook Pro. It has a 2.5 GHz Intel Core i7 processor. It has 4 physical cores with 2 logical cores per physical core. In thinking about the immutability I have come to the conclusion that I can't do it. If I make the relationship between Child and Parent immutable there is nothing that would prevent the Parent from having a Grandparent added to it, which kind of makes the whole idea pointless. I still have to work on the SLF4J wrapper but I hope I can get it committed today.
        Hide
        Ralph Goers added a comment -

        Changes were applied in revision 1585355. Please verify.

        Show
        Ralph Goers added a comment - Changes were applied in revision 1585355. Please verify.
        Hide
        Bruce Brouwer added a comment - - edited

        So, here are my thoughts. Maybe some of these should be made into new JIRAs.

        • Are we ok with getMarker sometimes creating a marker with the specified parents and sometimes returning a marker with a potentially different parent hierarchy? This is why my concept renamed those methods to define(name, parents...).
        • Do we want a version of getMarker (or define) that takes a list of parent marker names, in addition to the new one that takes a list of marker instances?
        • Should it be ok to add a parent twice in the hierarchy, once as a grand parent, once as a parent? If I was allowed to add marker X as an immediate parent of Y when X already existed as a grandparent, then removing the X grandparent would allow marker Y to remain an instance of X. The current implementation would not allow me to add X as a parent of Y and thus removing grandparent X would make Y no longer an instance of X, even though I explicitly specified I wanted X to be an immediate parent of Y. (This one is pretty minor and I'm willing to accept what is done)
        • Log4jMarker.getParents() should return a copy of the array so I can't change the contents from outside the control of log4j
        • In slf4j-impl Log4jMarkerFactory.getDetachedMarker(), my impression was that it should create a marker that wasn't actually attached yet to log4j, but by adding it to another attached marker would then make it attached. This implementation essentially creates the slf4j marker as already attached.
        Show
        Bruce Brouwer added a comment - - edited So, here are my thoughts. Maybe some of these should be made into new JIRAs. Are we ok with getMarker sometimes creating a marker with the specified parents and sometimes returning a marker with a potentially different parent hierarchy? This is why my concept renamed those methods to define(name, parents...). Do we want a version of getMarker (or define) that takes a list of parent marker names, in addition to the new one that takes a list of marker instances? Should it be ok to add a parent twice in the hierarchy, once as a grand parent, once as a parent? If I was allowed to add marker X as an immediate parent of Y when X already existed as a grandparent, then removing the X grandparent would allow marker Y to remain an instance of X. The current implementation would not allow me to add X as a parent of Y and thus removing grandparent X would make Y no longer an instance of X, even though I explicitly specified I wanted X to be an immediate parent of Y. (This one is pretty minor and I'm willing to accept what is done) Log4jMarker.getParents() should return a copy of the array so I can't change the contents from outside the control of log4j In slf4j-impl Log4jMarkerFactory.getDetachedMarker(), my impression was that it should create a marker that wasn't actually attached yet to log4j, but by adding it to another attached marker would then make it attached. This implementation essentially creates the slf4j marker as already attached.
        Hide
        Ralph Goers added a comment -
        • It isn't clear to me why define() is a better name than getMarker(). define() isn't going to really define anything if the Marker already exists, so I think getMarker is a better name.
        • Yes, I should add the getMarker(String name, String... parents).
        • No, you can't add a Marker twice in the hierarchy and create a cycle. SLF4J doesn't allow it either and it would make the checking even slower and uglier.
        • You are correct that it should make a copy of the array. I will fix that.
        • Yes. The performance difference between doing this == marker and this.equals(marker) is dramatic. The only way to insure there is only a single marker with a given name is for them to be attached, so they aren't supported in Log4j. I have no idea what the usefulness of detached markers are other then to confuse the hell out of everyone and create multiple Markers with the same name and different hierarchies. SLF4J doesn't provide a way for a detached Marker to become attached and Log4j doesn't even support the concept.
        Show
        Ralph Goers added a comment - It isn't clear to me why define() is a better name than getMarker(). define() isn't going to really define anything if the Marker already exists, so I think getMarker is a better name. Yes, I should add the getMarker(String name, String... parents). No, you can't add a Marker twice in the hierarchy and create a cycle. SLF4J doesn't allow it either and it would make the checking even slower and uglier. You are correct that it should make a copy of the array. I will fix that. Yes. The performance difference between doing this == marker and this.equals(marker) is dramatic. The only way to insure there is only a single marker with a given name is for them to be attached, so they aren't supported in Log4j. I have no idea what the usefulness of detached markers are other then to confuse the hell out of everyone and create multiple Markers with the same name and different hierarchies. SLF4J doesn't provide a way for a detached Marker to become attached and Log4j doesn't even support the concept.
        Hide
        Bruce Brouwer added a comment -
        • Well, my recommendation in using define is that it would actually change the parent hierarchy to be what is listed, so it is different than the current .getMarker(...). This way if I wasn't careful in my code, I wouldn't have to worry about the order of calls between .getMarker(name) and .define(name, parents...) (think if my markers are static member variables and I have little control over the order of class initialization). If getMarker(name) did somehow get called before define(name, parents...) they would both be pointing at the same marker and with the parent heirarchy I desired. I just now need to make sure that I only call define(name, parents...) once for each marker name.
        • Cool, thanks
        • I'm not talking about a cycle. I agree we need to avoid cycles. given A -> B -> C (A is child, B is parent, C is grand parent). If I add C to A, there is no cycle, it just is doubly parented on C. The cycle would happen if I tried to add A as a parent of C. The check parent.isInstanceOf(this) is completely valid and is what checks for the cycle. The check this.isInstanceOf(parent) is what I'm asking to change to only check immediate parents.
        • Cool, thanks
        • When I implemented this in my concept, I simply had the marker reference be null until it was attached. Once I add that detached marker to another attached marker, my reference to the detached marker should be considered invalid. Any use of it with the contains method against an attached marker should return false, even if there was an attached marker with the same name that was contained by the attached marker. When the marker is attached, I'm not asking to stop using the == check, I think that part is perfect. But I do understand that this makes the add method considerably more complicated, partly because it would need to check for cycles by checking parent names instead of ==, and partly because I might be attaching an entire parent hierarchy, not just a single marker. (I'm sure I'm just securing in your mind that it isn't worth it, aren't I?)
        Show
        Bruce Brouwer added a comment - Well, my recommendation in using define is that it would actually change the parent hierarchy to be what is listed, so it is different than the current .getMarker(...). This way if I wasn't careful in my code, I wouldn't have to worry about the order of calls between .getMarker(name) and .define(name, parents...) (think if my markers are static member variables and I have little control over the order of class initialization). If getMarker(name) did somehow get called before define(name, parents...) they would both be pointing at the same marker and with the parent heirarchy I desired. I just now need to make sure that I only call define(name, parents...) once for each marker name. Cool, thanks I'm not talking about a cycle. I agree we need to avoid cycles. given A -> B -> C (A is child, B is parent, C is grand parent). If I add C to A, there is no cycle, it just is doubly parented on C. The cycle would happen if I tried to add A as a parent of C. The check parent.isInstanceOf(this) is completely valid and is what checks for the cycle. The check this.isInstanceOf(parent) is what I'm asking to change to only check immediate parents. Cool, thanks When I implemented this in my concept, I simply had the marker reference be null until it was attached. Once I add that detached marker to another attached marker, my reference to the detached marker should be considered invalid. Any use of it with the contains method against an attached marker should return false, even if there was an attached marker with the same name that was contained by the attached marker. When the marker is attached, I'm not asking to stop using the == check, I think that part is perfect. But I do understand that this makes the add method considerably more complicated, partly because it would need to check for cycles by checking parent names instead of ==, and partly because I might be attaching an entire parent hierarchy, not just a single marker. (I'm sure I'm just securing in your mind that it isn't worth it, aren't I?)
        Hide
        Ralph Goers added a comment -
        • What you are proposing is a good idea but if it was done I wouldn't want a get method that could also create a Marker and a subsequent call to define for a Marker should throw an Exception. That said, I am reluctant to make this change since it isn't compatible with any existing code. Although we are not GA I am reluctant to make significant API changes at this point.
        • I understand what you are asking for about adding markers to different parts of the tree. I will consider that.
        • Once you have a Marker there is nothing preventing you from using it in a logging method. If it is detached it will always fail. What is the benefit of having a detached Marker? It seems to me it just creates problems for very little benefit, if any.
        Show
        Ralph Goers added a comment - What you are proposing is a good idea but if it was done I wouldn't want a get method that could also create a Marker and a subsequent call to define for a Marker should throw an Exception. That said, I am reluctant to make this change since it isn't compatible with any existing code. Although we are not GA I am reluctant to make significant API changes at this point. I understand what you are asking for about adding markers to different parts of the tree. I will consider that. Once you have a Marker there is nothing preventing you from using it in a logging method. If it is detached it will always fail. What is the benefit of having a detached Marker? It seems to me it just creates problems for very little benefit, if any.
        Hide
        Bruce Brouwer added a comment -
        • But we've made other breaking changes since RC1. Now, maybe they haven't broken the use of log4j-api, but changes have happened in log4j-core that could cause breakage. Should I make it a separate JIRA and see what others have to say?
        • Thanks
        • Honestly, I don't quite understand the point of detached markers. So I agree there is going to be very little benefit to this. And if someone finds a real use case that isn't handled, we can always fix it later.
        Show
        Bruce Brouwer added a comment - But we've made other breaking changes since RC1. Now, maybe they haven't broken the use of log4j-api, but changes have happened in log4j-core that could cause breakage. Should I make it a separate JIRA and see what others have to say? Thanks Honestly, I don't quite understand the point of detached markers. So I agree there is going to be very little benefit to this. And if someone finds a real use case that isn't handled, we can always fix it later.
        Hide
        Ralph Goers added a comment -

        You don't need another issue to make the change to the API. I would suggest you start a separate discussion on the dev list and we can update this based on what others say.

        We've made a couple of changes to the API but I don't recall if they were as significant as this.

        Show
        Ralph Goers added a comment - You don't need another issue to make the change to the API. I would suggest you start a separate discussion on the dev list and we can update this based on what others say. We've made a couple of changes to the API but I don't recall if they were as significant as this.
        Hide
        Ralph Goers added a comment -

        I've applied most of the changes in revision 1586864. Please review them. I haven't made the change to remove this.isInstanceOf(parent) yet as I haven't spent enough time thinking about that.

        Show
        Ralph Goers added a comment - I've applied most of the changes in revision 1586864. Please review them. I haven't made the change to remove this.isInstanceOf(parent) yet as I haven't spent enough time thinking about that.
        Hide
        Bruce Brouwer added a comment -
        • I like the new fluent API.
        • Do you think it will be confusing that we have one method called simply set, add or remove, while the getter is called getParents()? I'm thinking particularly about the set vs. getParents.
        • You synchronized add and remove, but not set. I know it isn't absolutely necessary, but might it cause confusing behavior if some race condition occurred from client code calling add and set concurrently? I doubt synchronizing set is going to cause serious performance problems.
        • Commentary: Alternatively, in getParents, you could simply return Arrays.copyOf(this.parents, this.parents.length); instead of creating a temporary array and calling System.arraycopy(...). But what you wrote works perfectly fine.
        • It looks like you implemented what I asked regarding this.isInstanceOf(parent) by implementing that contains method. I like it. Thanks.
        • I see you deprecated the getMarker methods that specify a parent. Thanks. Is it the plan to remove deprecated methods before 2.0 GA?
        Show
        Bruce Brouwer added a comment - I like the new fluent API. Do you think it will be confusing that we have one method called simply set, add or remove, while the getter is called getParents()? I'm thinking particularly about the set vs. getParents. You synchronized add and remove, but not set. I know it isn't absolutely necessary, but might it cause confusing behavior if some race condition occurred from client code calling add and set concurrently? I doubt synchronizing set is going to cause serious performance problems. Commentary: Alternatively, in getParents, you could simply return Arrays.copyOf(this.parents, this.parents.length); instead of creating a temporary array and calling System.arraycopy(...) . But what you wrote works perfectly fine. It looks like you implemented what I asked regarding this.isInstanceOf(parent) by implementing that contains method. I like it. Thanks. I see you deprecated the getMarker methods that specify a parent. Thanks. Is it the plan to remove deprecated methods before 2.0 GA?
        Hide
        Ralph Goers added a comment -
        • I renamed set to setParents.
        • I don't really see the point of synchronzing the setParents method. It won't really resolve any race conditions that happen with add or remove.
        • I changed to use Array.copyOf as you suggested.
        • All the deprecated methods should be removed before GA.

        Thanks for all your help.

        Show
        Ralph Goers added a comment - I renamed set to setParents. I don't really see the point of synchronzing the setParents method. It won't really resolve any race conditions that happen with add or remove. I changed to use Array.copyOf as you suggested. All the deprecated methods should be removed before GA. Thanks for all your help.

          People

          • Assignee:
            Unassigned
            Reporter:
            Bruce Brouwer
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development