OFBiz
  1. OFBiz
  2. OFBIZ-4282

TransactionUtil performance optimisations

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: SVN trunk
    • Fix Version/s: SVN trunk
    • Component/s: framework
    • Labels:

      Description

      Hello,
      Reviewing TransactionUtil code, I have seen 2 problems:

      • internalBegin is uselessly synchronized , since it is a static method it is a very big useless Contention Point since not unthread safe instance variable is used
      • debugResources is true which creates a DebugXAResource (that creates an Exception) , it should be false and made an option for debuging

      These 2 modifications have been in our production for a while and we noticed CPU reduction and no more contention on TransactionUtil#begin

      Regards
      Philippe Mouawad
      http://www.ubik-ingenierie.com

      There are no Sub-Tasks for this issue.

        Activity

        Hide
        Jacques Le Roux added a comment -

        Thanks Philippe, David,

        Philippe, yes it was a good idea to create sub-tasks. In such case just avoid to put same change in both/more patches (internalBegin in 4293 ;o)

        Show
        Jacques Le Roux added a comment - Thanks Philippe, David, Philippe, yes it was a good idea to create sub-tasks. In such case just avoid to put same change in both/more patches (internalBegin in 4293 ;o)
        Hide
        Philippe Mouawad added a comment -

        Hello,
        I created 2 subtasks (hope it's not a mistake), because I don't have time to do the configurable now, I will try to do it as soon as possible.

        Thank you both for your validation
        Regards
        Philippe

        Show
        Philippe Mouawad added a comment - Hello, I created 2 subtasks (hope it's not a mistake), because I don't have time to do the configurable now, I will try to do it as soon as possible. Thank you both for your validation Regards Philippe
        Hide
        Jacques Le Roux added a comment -

        Thanks David,

        It clarifies all.

        Philippe,

        Would you like to provide another patch? I'm OK to commit it.

        Show
        Jacques Le Roux added a comment - Thanks David, It clarifies all. Philippe, Would you like to provide another patch? I'm OK to commit it.
        Hide
        David E. Jones added a comment -
            • For #1:

        I did a little more research, and it looks like you're right Philippe, the synchronized is no longer needed. It was there originally because before JTA 1.1 the setTransactionTimeout method was defined like this:

        "Modify the timeout value that is associated with transactions started by subsequent invocations of the begin method."

        In JTA 1.1 the definition was changed to fix just this issue and now reads like this:

        "Modify the timeout value that is associated with transactions started by subsequent invocations of the begin method by the current thread."

        As long as the JTA implementation used follows this rule, then leaving the method unsynchronized should be fine.

            • For #2:

        I didn't say I'm against a configurable setting, like a properties file setting. I said I'm against changing the default in the code for all of OFBiz without making it configurable.

        On the topic of configuration: I'm against business-level configuration in properties files (that should go in the DB), I'm not against technical or system configuration in properties files, in fact that's just where it belongs.

        Show
        David E. Jones added a comment - For #1: I did a little more research, and it looks like you're right Philippe, the synchronized is no longer needed. It was there originally because before JTA 1.1 the setTransactionTimeout method was defined like this: "Modify the timeout value that is associated with transactions started by subsequent invocations of the begin method." In JTA 1.1 the definition was changed to fix just this issue and now reads like this: "Modify the timeout value that is associated with transactions started by subsequent invocations of the begin method by the current thread." As long as the JTA implementation used follows this rule, then leaving the method unsynchronized should be fine. For #2: I didn't say I'm against a configurable setting, like a properties file setting. I said I'm against changing the default in the code for all of OFBiz without making it configurable. On the topic of configuration: I'm against business-level configuration in properties files (that should go in the DB), I'm not against technical or system configuration in properties files, in fact that's just where it belongs.
        Hide
        Jacques Le Roux added a comment -

        First thanks to both of you, for explaining the problem in details and clearly give your POVs

        About debugResources (debug stack)
        We could easily create a properties for that. But I know David is a bit against adding more and more external paramaters. I must say OFBiz already "offers" a lot of them, and it's often the "easy"' solution. I see 2 contradictory aspects in this.

        1. If the parameters is well documented (API + Wiki: Technical Setup), it adds some value because it opens a door out of code.
        2. Tons of parameters kills parameters, even well documented. But then this does not mean that things should not change. It could be that among current parameters some are of little value (or even deprecated), when some could be added with more value (hence replacing/removing some of little value when the number begins to be a problem, but then legacy/history caught us)

        So as ever, we would need to find a consensus in the community. I'm know I'm an hopelessly optimist, but I hope so...

        For internalBegin, I guess David was suggesting to rather investigate screens in customized applications to set use-transaction="false" when no transactions are really needed. David, from your comment I wonder then if we should not have use-transaction="false" by default? Why the contrary was chosen?

        My 2 cts

        Show
        Jacques Le Roux added a comment - First thanks to both of you, for explaining the problem in details and clearly give your POVs About debugResources (debug stack) We could easily create a properties for that. But I know David is a bit against adding more and more external paramaters. I must say OFBiz already "offers" a lot of them, and it's often the "easy"' solution. I see 2 contradictory aspects in this. If the parameters is well documented (API + Wiki: Technical Setup), it adds some value because it opens a door out of code. Tons of parameters kills parameters, even well documented. But then this does not mean that things should not change. It could be that among current parameters some are of little value (or even deprecated), when some could be added with more value (hence replacing/removing some of little value when the number begins to be a problem, but then legacy/history caught us) So as ever, we would need to find a consensus in the community. I'm know I'm an hopelessly optimist, but I hope so... For internalBegin, I guess David was suggesting to rather investigate screens in customized applications to set use-transaction="false" when no transactions are really needed. David, from your comment I wonder then if we should not have use-transaction="false" by default? Why the contrary was chosen? My 2 cts
        Hide
        Philippe Mouawad added a comment -

        Again,
        I agree with you about useless transactions in some places but it is a more impacting change.
        So it would be located in ModelScreen#renderScreenString ?

        Regards
        Philippe

        Show
        Philippe Mouawad added a comment - Again, I agree with you about useless transactions in some places but it is a more impacting change. So it would be located in ModelScreen#renderScreenString ? Regards Philippe
        Hide
        Philippe Mouawad added a comment - - edited

        Hello M. Jones,
        First let met describe to you how/when I propose an optimisation:
        1) We track down in our production CPU and Contention points
        2) We make thread dumps
        Then analyse them.
        We make a load test on a platform that mimics the production with only the change and track response times, CPU and monitor changes.
        We also profile ofbiz on load test platform with some well known Profiling tools I won't put here.

        Then to answer your remarks:
        1) I am sorry but I don't agree on the fact that making the method not synchronized, the timeout for a transaction would bleed over into other transaction, timeout is a parameter so local to the call, synchro would be required if timeout was instance variable (and it wouldn't anyway protect from bleed) or static without being thread local. I think synchronized was here at the start because there were some instance and static variables in TransactionUtil that were not thread safe, but there were some contributions (one of mine if I remember well ?) that fixed the issues to make them Thread Local or use synchronized Collections so this is not more useful. This modification is in production for more than a year now and Contention point was solved.

        2) I agree with you that this feature is helpful for finding the source of transaction timeout BUT consider its cost on a web site with 700 Transaction Hits/s (which is my case), we get 700 Exceptions created each second just for a potential issue, that's why I was saying that disabling it should be the default and proposing method to enable it thrhough webadmin once you have timeout transaction or through a property is better. Once again at the time we saw CPU on production servers drop from 5%.

        Anyway I repeat that these 2 fixes were HARDLY tested on preproduction then production (where synchronisation issues have the best chance to occur) since a year now.

        Regards
        Philippe

        Show
        Philippe Mouawad added a comment - - edited Hello M. Jones, First let met describe to you how/when I propose an optimisation: 1) We track down in our production CPU and Contention points 2) We make thread dumps Then analyse them. We make a load test on a platform that mimics the production with only the change and track response times, CPU and monitor changes. We also profile ofbiz on load test platform with some well known Profiling tools I won't put here. Then to answer your remarks: 1) I am sorry but I don't agree on the fact that making the method not synchronized, the timeout for a transaction would bleed over into other transaction, timeout is a parameter so local to the call, synchro would be required if timeout was instance variable (and it wouldn't anyway protect from bleed) or static without being thread local. I think synchronized was here at the start because there were some instance and static variables in TransactionUtil that were not thread safe, but there were some contributions (one of mine if I remember well ?) that fixed the issues to make them Thread Local or use synchronized Collections so this is not more useful. This modification is in production for more than a year now and Contention point was solved. 2) I agree with you that this feature is helpful for finding the source of transaction timeout BUT consider its cost on a web site with 700 Transaction Hits/s (which is my case), we get 700 Exceptions created each second just for a potential issue, that's why I was saying that disabling it should be the default and proposing method to enable it thrhough webadmin once you have timeout transaction or through a property is better. Once again at the time we saw CPU on production servers drop from 5%. Anyway I repeat that these 2 fixes were HARDLY tested on preproduction then production (where synchronisation issues have the best chance to occur) since a year now. Regards Philippe
        Hide
        David E. Jones added a comment -

        Philippe: did you consider the down-sides to these changes? It's fine if you want to make these changes locally, but consider undesirable behavior if they were the default:

        1. If internalBegin is not synchronized the timeout for a transaction would bleed over into other transactions; unfortunately this is a major weakness with the JTA API: you can only change timeout for the entire transaction manager, you can't set it for a single transaction.

        2. If the debug stack is not maintained it makes it harder to track down transaction and various database-related issues, even in production. If you haven't found it to be useful, then again, it's fine to change locally but it is valuable information to have when tracking down bad code or certain lower-level problems. Also, have you done performance measurements with and without to see how much of a change there is?

        About the synchronization problem, that is a significant performance issue that I have seen while working with clients. However, the real cause of the problem is higher up because by default all screens are run within a transaction, and really it's pretty rare that a screen needs to be run in a transaction, so that is where a change should be made.

        Either way, performance "improvements" without metrics and tracking down the main cause of problems is likely to produce as many problems as it solves.

        Based on that, I'd say these changes should NOT be committed.

        Show
        David E. Jones added a comment - Philippe: did you consider the down-sides to these changes? It's fine if you want to make these changes locally, but consider undesirable behavior if they were the default: 1. If internalBegin is not synchronized the timeout for a transaction would bleed over into other transactions; unfortunately this is a major weakness with the JTA API: you can only change timeout for the entire transaction manager, you can't set it for a single transaction. 2. If the debug stack is not maintained it makes it harder to track down transaction and various database-related issues, even in production. If you haven't found it to be useful, then again, it's fine to change locally but it is valuable information to have when tracking down bad code or certain lower-level problems. Also, have you done performance measurements with and without to see how much of a change there is? About the synchronization problem, that is a significant performance issue that I have seen while working with clients. However, the real cause of the problem is higher up because by default all screens are run within a transaction, and really it's pretty rare that a screen needs to be run in a transaction, so that is where a change should be made. Either way, performance "improvements" without metrics and tracking down the main cause of problems is likely to produce as many problems as it solves. Based on that, I'd say these changes should NOT be committed.
        Hide
        Philippe Mouawad added a comment -

        Hello Karl,
        I commented on OFBIZ-1029

        Show
        Philippe Mouawad added a comment - Hello Karl, I commented on OFBIZ-1029
        Hide
        Karl Eilebrecht added a comment -

        Hi Philippe,

        I saw that you're trying to improve TransactionUtil.
        Did you review the following issue?
        https://issues.apache.org/jira/browse/OFBIZ-1029
        Some years ago I refactored the transaction handling completely for my company
        because we had some requirements and could not live with some of the
        (maybe still today) contained limitations and bugs. However the proposal was never
        reviewed, yet, and we then "developed away" a little from standard Ofbiz.
        But I can tell you, that we are using this code for years now without
        any problems, fast and reliable.

        Here is a list what I did:

        • problem suspending transaction in status rollback
        • improved state stack for begin-suspend-resume ... commit/rollback cycle (original implementation loses information)
        • improved (less) synchronization
        • improved timestamp creation
        • introduced transaction id (for logging/debugging)
        • fixed bug in cleanSuspendedTransactions (cleanup should not abort after single rollback error)
        • added comments/API-doc

        Maybe you could take a look at it and include some ideas into
        your work.

        Regards and good luck!
        Karl

        Show
        Karl Eilebrecht added a comment - Hi Philippe, I saw that you're trying to improve TransactionUtil. Did you review the following issue? https://issues.apache.org/jira/browse/OFBIZ-1029 Some years ago I refactored the transaction handling completely for my company because we had some requirements and could not live with some of the (maybe still today) contained limitations and bugs. However the proposal was never reviewed, yet, and we then "developed away" a little from standard Ofbiz. But I can tell you, that we are using this code for years now without any problems, fast and reliable. Here is a list what I did: problem suspending transaction in status rollback improved state stack for begin-suspend-resume ... commit/rollback cycle (original implementation loses information) improved (less) synchronization improved timestamp creation introduced transaction id (for logging/debugging) fixed bug in cleanSuspendedTransactions (cleanup should not abort after single rollback error) added comments/API-doc Maybe you could take a look at it and include some ideas into your work. Regards and good luck! Karl
        Hide
        Philippe Mouawad added a comment -

        Patch that fixes the issue

        Show
        Philippe Mouawad added a comment - Patch that fixes the issue

          People

          • Assignee:
            Jacques Le Roux
            Reporter:
            Philippe Mouawad
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development