ODE
  1. ODE
  2. ODE-876

ODE 1.3.4 is 10 times slower than ODE 1.3.3

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.3.3
    • Fix Version/s: 1.3.5
    • Component/s: BPEL Runtime
    • Labels:
      None
    • Environment:
      ODE 1.3.4 JBI, Fuse 3.4.0.5, ODE database in memory

      Description

      We've recently migrated from ODE 1.2 JBI to ODE 1.3.4 JBI.

      We were unable to migrate earlier due to criticial bugs in JBI packages in 1.3.3 and 1.3.2..

      We ran the same performance test against new ODE 1.3.4 and previous ODE 1.2
      The same configuration, the same environment, the same test case, only different ODE zip...

      For example:
      A service which has about 7 invokes bpel activities with ODE 1.3.4 spends on processing 5500 ms
      with ODE 1.2 600ms.. The timings spend on these 7 services doesn't change. JBI bus time doesn;t change. Nothing has changed except ODE version.

      It happens the same for our all orchestrated services.

      The only thing which was changed is ODE version.

      I can't provide the test itself (prioprietary) but I can provide configuration.
      Our services are only in memory.
      ODE database is JBI external and it is Derby in memory.

      However the results are shocking. We of course abandon ODE 1.3.4 ans stay with ODE 1.2, but if it is not fixed, we may abandon ODE completely.

      1. ping-pong-sa-1.0.0.zip
        32 kB
        Rafal Rusin
      2. ping-pong-load-test-soapui-project.xml
        6 kB
        Rafal Rusin
      3. OdeProfiling.pdf
        364 kB
        Mateusz Nowakowski
      4. ODE-876-2.patch
        8 kB
        Tammo van Lessen
      5. ODE-876.patch
        4 kB
        Tammo van Lessen
      6. ODE1.3.4WithODEPath.pdf
        302 kB
        Mateusz Nowakowski
      7. ode1.3.4.3Perf.pdf
        237 kB
        Mateusz Nowakowski
      8. ODE_performanceComparison.pdf
        303 kB
        Mateusz Nowakowski

        Issue Links

          Activity

          Hide
          Rafal Rusin added a comment -

          Please provide sample process and soapui load test, so we can check it out.

          Regards

          Show
          Rafal Rusin added a comment - Please provide sample process and soapui load test, so we can check it out. Regards
          Hide
          Mateusz Nowakowski added a comment -

          As I said I can't provide the test or process, because it is JBI based, so I would have to provide all JBI endpoints as well.

          Is there any example of any load test (process and soapui test) in ODE JBI distibution?

          It is so visible, so any load test should prove that.

          Show
          Mateusz Nowakowski added a comment - As I said I can't provide the test or process, because it is JBI based, so I would have to provide all JBI endpoints as well. Is there any example of any load test (process and soapui test) in ODE JBI distibution? It is so visible, so any load test should prove that.
          Hide
          Rafal Rusin added a comment -

          I know you can't provide. However I don't think there's other way than you to prepare test case, which is slow and then we can work on it. You can use ping pong example (https://svn.apache.org/repos/asf/ode/trunk/distro/src/examples-jbi/maven2/ping-pong/).

          Show
          Rafal Rusin added a comment - I know you can't provide. However I don't think there's other way than you to prepare test case, which is slow and then we can work on it. You can use ping pong example ( https://svn.apache.org/repos/asf/ode/trunk/distro/src/examples-jbi/maven2/ping-pong/ ).
          Hide
          Rafal Rusin added a comment - - edited

          I did performance comparison for ODE 1.2 and 1.3.4 JBI for 18 invokes (communication between ping and pong is over EIP - Seda Queue).
          Test was run on 10 threads.
          Please note that test case needs to be run twice, because JVM optimizes some functions and TPS grows.

          Results are:
          ODE 1.2: 11.33 TPS
          ODE 1.3.4: 10.82 TPS

          So performance drop is around 5%.
          That's why I need a particular test case.

          Regards

          Show
          Rafal Rusin added a comment - - edited I did performance comparison for ODE 1.2 and 1.3.4 JBI for 18 invokes (communication between ping and pong is over EIP - Seda Queue). Test was run on 10 threads. Please note that test case needs to be run twice, because JVM optimizes some functions and TPS grows. Results are: ODE 1.2: 11.33 TPS ODE 1.3.4: 10.82 TPS So performance drop is around 5%. That's why I need a particular test case. Regards
          Hide
          Mateusz Nowakowski added a comment -

          The PingPong is a nice startup point to provide the example. So I'll try but it may take some time.

          The main difference between PingPong and my services is that my services invokes a XSL2 transformations before and after invoke. The payloads are bigger than in PingPong, for example 20 KB.
          What is more each invoke takes about 50-100ms to process so it may have impact on thread scheduling.
          GC parameters are fixed (for young and old gen) however it may change during the test but not significantly, because gc full scans are rare.

          I do not have Saxon on global SMX classpath so ODE uses own Saxon from own service engine. Saxon version has changed, so it may be this.
          I'll repeat the test with old Saxon and let you know.

          Show
          Mateusz Nowakowski added a comment - The PingPong is a nice startup point to provide the example. So I'll try but it may take some time. The main difference between PingPong and my services is that my services invokes a XSL2 transformations before and after invoke. The payloads are bigger than in PingPong, for example 20 KB. What is more each invoke takes about 50-100ms to process so it may have impact on thread scheduling. GC parameters are fixed (for young and old gen) however it may change during the test but not significantly, because gc full scans are rare. I do not have Saxon on global SMX classpath so ODE uses own Saxon from own service engine. Saxon version has changed, so it may be this. I'll repeat the test with old Saxon and let you know.
          Hide
          Mateusz Nowakowski added a comment -

          Hi,

          We have finally some time and resources to work on that. (limited of course)

          First of all we did two thinks:
          1) retest it in different environment, with small amount of required TPS to check how big traffic is required to make it visible;

          Test plan and results attached.

          It is seen even with 5TPS.
          For 10 TPS the response time is 8-10 times bigger for ODE 1.3.4. So it doesn't require big traffic.
          CPU utilization is bigger 5-10 %
          Memory utilization 2-3 time bigger (measured as the amount of FullGCs)

          2) Fallback Saxon,
          It is not easy as expected, after fallback the processes don't work.

          Our plan is to:
          1) run some Java profilers
          2) static code analysis.. it can be tough
          3) write some project from scratch under Fuse/SMX3 and give you as an example.

          Show
          Mateusz Nowakowski added a comment - Hi, We have finally some time and resources to work on that. (limited of course) First of all we did two thinks: 1) retest it in different environment, with small amount of required TPS to check how big traffic is required to make it visible; Test plan and results attached. It is seen even with 5TPS. For 10 TPS the response time is 8-10 times bigger for ODE 1.3.4. So it doesn't require big traffic. CPU utilization is bigger 5-10 % Memory utilization 2-3 time bigger (measured as the amount of FullGCs) 2) Fallback Saxon, It is not easy as expected, after fallback the processes don't work. Our plan is to: 1) run some Java profilers 2) static code analysis.. it can be tough 3) write some project from scratch under Fuse/SMX3 and give you as an example.
          Hide
          Mateusz Nowakowski added a comment -

          We've found what was the issue.

          We did Java profiling and found that JaxpVariableResolver.getSimpleContent(...) run exponentialy longer than the previous version.
          See attached ODEProfiling.pdf

          ODE-759 which was applied in 1.3.4 introduced a change in org.apache.ode.bpel.elang.xpath20.runtime.JaxpVariableResolver
          that SaxonConfiguration is created each time JaxpVariableResolver.getSimpleContent(...) is invoked, it was the cause of all evil.

          We've applied simple patch (in ODEProfiling.pdf) and the results are even better than ODE 1.2.

          Now I have two requests:

          • Include the fix and release 1.3.5.RC which is exactly 1.3.4 + only this fix.
            Give us two weeks to bless 1.3.5.RC and release 1.3.5.
            We've lost several months (and money) due to this issue and we can't wait another months to some new release which can contains another bugs.

          We would like to have it done ASAP.

          • Introduce real word examples and real world performance tests.
          Show
          Mateusz Nowakowski added a comment - We've found what was the issue. We did Java profiling and found that JaxpVariableResolver.getSimpleContent(...) run exponentialy longer than the previous version. See attached ODEProfiling.pdf ODE-759 which was applied in 1.3.4 introduced a change in org.apache.ode.bpel.elang.xpath20.runtime.JaxpVariableResolver that SaxonConfiguration is created each time JaxpVariableResolver.getSimpleContent(...) is invoked, it was the cause of all evil. We've applied simple patch (in ODEProfiling.pdf) and the results are even better than ODE 1.2. Now I have two requests: Include the fix and release 1.3.5.RC which is exactly 1.3.4 + only this fix. Give us two weeks to bless 1.3.5.RC and release 1.3.5. We've lost several months (and money) due to this issue and we can't wait another months to some new release which can contains another bugs. We would like to have it done ASAP. Introduce real word examples and real world performance tests.
          Hide
          Jeff Yu added a comment -

          Thanks a lot for this profiling.

          But if you want to have 1.3.5 release = 1.3.4 + this fix, IMHO I believe you can check out the 1.3.4 branch code at: http://svn.apache.org/repos/asf/ode/branches/APACHE_ODE_1.3.4.X/ or http://svn.apache.org/repos/asf/ode/tags/APACHE_ODE_1.3.4/, and then apply this fix into it, and then do the build and release by your own, this will be the quickest way to meet your demand.

          What we will do is that we will include your fix into our ODE upstream versions, like 1.3.5.

          HTH
          Jeff

          Show
          Jeff Yu added a comment - Thanks a lot for this profiling. But if you want to have 1.3.5 release = 1.3.4 + this fix, IMHO I believe you can check out the 1.3.4 branch code at: http://svn.apache.org/repos/asf/ode/branches/APACHE_ODE_1.3.4.X/ or http://svn.apache.org/repos/asf/ode/tags/APACHE_ODE_1.3.4/ , and then apply this fix into it, and then do the build and release by your own, this will be the quickest way to meet your demand. What we will do is that we will include your fix into our ODE upstream versions, like 1.3.5. HTH Jeff
          Hide
          Mateusz Nowakowski added a comment -

          That's what I did, but we would like to have new release as fast as possible.
          We can't work on homemade ODE, but on official release.
          So when can we expect 1.3.5?

          Show
          Mateusz Nowakowski added a comment - That's what I did, but we would like to have new release as fast as possible. We can't work on homemade ODE, but on official release. So when can we expect 1.3.5?
          Hide
          Tammo van Lessen added a comment -

          Hi Mateusz,

          thanks for your detailed bug report and the patch. Your work is much appreciated. I also think this issue is reason enough for a 1.3.5 release, also given the fact that our current trunk is more going towards a 1.4 release (especially if we include the JPA refactoring). Are there other issues that have been fixed in trunk that should go into 1.3.5?

          Tammo

          Show
          Tammo van Lessen added a comment - Hi Mateusz, thanks for your detailed bug report and the patch. Your work is much appreciated. I also think this issue is reason enough for a 1.3.5 release, also given the fact that our current trunk is more going towards a 1.4 release (especially if we include the JPA refactoring). Are there other issues that have been fixed in trunk that should go into 1.3.5? Tammo
          Hide
          Tammo van Lessen added a comment -

          Oh, and another question: Mateusz, during profiling, did you experience other performance bottlenecks that we should be aware of or that should be taken care of?

          Thanks,
          Tammo

          Show
          Tammo van Lessen added a comment - Oh, and another question: Mateusz, during profiling, did you experience other performance bottlenecks that we should be aware of or that should be taken care of? Thanks, Tammo
          Hide
          Mateusz Nowakowski added a comment -

          >Are there other issues that have been fixed in trunk that should go into 1.3.5?
          To be honest, we didn't search other bottlenecks. We've found it, fixed it and tested. The results are similar to ODE 1.2, even slightly better than 1.2, so we stopped further testing, because the results are acceptable for us.

          >Are there other issues that have been fixed in trunk that should go into 1.3.5?
          To be honest, we spent some time to make sure that 1.3.4 is compatible with 1.2 for functional point of view for our existing processes.
          Now we are almost sure that from performance point of view it works similarly, so we don't want revolutionary changes in 1.3.5.

          The previous three releases of JBI ODE didn't work for us (for example JIRAS: ODE-603, ODE-604, ODE-605, ODE-651, ODE-806, ODE-840 and now ODE-876), so we are still on ODE 1.2 and now we are extremely cautious.

          When you are ready to release 1.3.5, please release it as RC, freeze the code, so that we will have about 2 weeks to verify that it works for us.
          We need to perform some bigger, "official" tests and we have to find "free" slots for that.

          Show
          Mateusz Nowakowski added a comment - >Are there other issues that have been fixed in trunk that should go into 1.3.5? To be honest, we didn't search other bottlenecks. We've found it, fixed it and tested. The results are similar to ODE 1.2, even slightly better than 1.2, so we stopped further testing, because the results are acceptable for us. >Are there other issues that have been fixed in trunk that should go into 1.3.5? To be honest, we spent some time to make sure that 1.3.4 is compatible with 1.2 for functional point of view for our existing processes. Now we are almost sure that from performance point of view it works similarly, so we don't want revolutionary changes in 1.3.5. The previous three releases of JBI ODE didn't work for us (for example JIRAS: ODE-603 , ODE-604 , ODE-605 , ODE-651 , ODE-806 , ODE-840 and now ODE-876 ), so we are still on ODE 1.2 and now we are extremely cautious. When you are ready to release 1.3.5, please release it as RC, freeze the code, so that we will have about 2 weeks to verify that it works for us. We need to perform some bigger, "official" tests and we have to find "free" slots for that.
          Hide
          Tammo van Lessen added a comment -

          Hi Mateusz,

          I have reviewed your patch. The constructor actually overrides the static field each time it gets called, so IMO this is a potential race condition. Also the XQuery expression runtime needs a customized Configuration object. I slightly changed the patch, could you please review and test it briefly? The bottleneck should be removed anyways, instead of creating a new Configuration object each time, I'm using the one that is passed in by the runtime.

          Thanks,
          Tammo

          Show
          Tammo van Lessen added a comment - Hi Mateusz, I have reviewed your patch. The constructor actually overrides the static field each time it gets called, so IMO this is a potential race condition. Also the XQuery expression runtime needs a customized Configuration object. I slightly changed the patch, could you please review and test it briefly? The bottleneck should be removed anyways, instead of creating a new Configuration object each time, I'm using the one that is passed in by the runtime. Thanks, Tammo
          Hide
          Mateusz Nowakowski added a comment -

          We'll retest the patch and let you know soon.

          Show
          Mateusz Nowakowski added a comment - We'll retest the patch and let you know soon.
          Hide
          Mateusz Nowakowski added a comment -

          We've applied the patch and the results are attached (ODE1.3.4WithODEPath.pdf).

          The results are slightly worse from timing perspective and there are twice more FullGC during the same timeframe.

          I'm wondering how many Saxon configurations are used during processing one transaction?

          I don't know the architecture of the ODE, but I think that there should be only one Configratuon object per BPEL service.

          Show
          Mateusz Nowakowski added a comment - We've applied the patch and the results are attached (ODE1.3.4WithODEPath.pdf). The results are slightly worse from timing perspective and there are twice more FullGC during the same timeframe. I'm wondering how many Saxon configurations are used during processing one transaction? I don't know the architecture of the ODE, but I think that there should be only one Configratuon object per BPEL service.
          Hide
          Mateusz Nowakowski added a comment -

          PS. the line
          __log.debug("getSimpleContent for " + DOMUtils.domToString(simpleNode) + " " + type + " returned " + o);
          in JaxpVariableResolver
          should be covered by isDebugEnabled as well

          Show
          Mateusz Nowakowski added a comment - PS. the line __log.debug("getSimpleContent for " + DOMUtils.domToString(simpleNode) + " " + type + " returned " + o); in JaxpVariableResolver should be covered by isDebugEnabled as well
          Hide
          Mateusz Nowakowski added a comment -

          XPath20ExpressionRuntime

          and its method
          private Object evaluate(OExpression cexp, EvaluationContext ctx, QName type) throws FaultException, EvaluationException {
          creates each time
          XPathFactory xpf = new XPathFactoryImpl();
          which creates each time Saxon Configuration inside.

          Shouldn't xpf be created only once and the resolvers were added after xpe is created?

          Show
          Mateusz Nowakowski added a comment - XPath20ExpressionRuntime and its method private Object evaluate(OExpression cexp, EvaluationContext ctx, QName type) throws FaultException, EvaluationException { creates each time XPathFactory xpf = new XPathFactoryImpl(); which creates each time Saxon Configuration inside. Shouldn't xpf be created only once and the resolvers were added after xpe is created?
          Hide
          Mateusz Nowakowski added a comment -

          JaxpVariableResolver.java
          The line:
          Document doc = (simpleNode instanceof Document) ? ((Document) simpleNode) : simpleNode.getOwnerDocument();
          is useless.
          The doc is never read.

          Show
          Mateusz Nowakowski added a comment - JaxpVariableResolver.java The line: Document doc = (simpleNode instanceof Document) ? ((Document) simpleNode) : simpleNode.getOwnerDocument(); is useless. The doc is never read.
          Hide
          Tammo van Lessen added a comment -

          Hi Mateusz,

          I've created a new patch with your suggestions. The XPathFactory is now a field so that the configuration object is created only once per process model. Also I've added a couple of isDebugEnabled() checks.

          Could you please test it again? I think we're coming closer.

          Thanks,
          Tammo

          Show
          Tammo van Lessen added a comment - Hi Mateusz, I've created a new patch with your suggestions. The XPathFactory is now a field so that the configuration object is created only once per process model. Also I've added a couple of isDebugEnabled() checks. Could you please test it again? I think we're coming closer. Thanks, Tammo
          Hide
          Mateusz Nowakowski added a comment -

          Attached ode1.3.4.3Perf.pdf which is a comparison between ODE 1.2, our fix and 1.3.4.3 which is the latest patch.

          The results are satisfying, it is better than ODE 1.2 but still slightly worse than the "static" configuration patch (to be honest I don't know why it is still better).

          In conclusion the latest patch is ok, but there are plenty of places that code can be improved.

          Please start focusing on performance.

          For example there are still places where: (examples come from XPath20ExpressionRuntime)

                      result = (List) someRes;
                      __log.debug("Returned list of size " + result.size());
          

          or:

          Node contextNode = ctx.getRootNode() == null ? DOMUtils.newDocument() : ctx.getRootNode();
          

          it can be replaced by:

                      Node contextNode = ctx.getRootNode();
                      if (contextNode == null){
                      	contextNode = DOMUtils.newDocument();
                      }
          

          or

              static String generateMessageExchangeIdentifier(String partnerlinkName, String operationName) {
                  StringBuffer sb = new StringBuffer(partnerlinkName);
                  sb.append('.');
                  sb.append(operationName);
                  return sb.toString();
              }
          

          StringBuffer can be replaced by StringBuilder

          Show
          Mateusz Nowakowski added a comment - Attached ode1.3.4.3Perf.pdf which is a comparison between ODE 1.2, our fix and 1.3.4.3 which is the latest patch. The results are satisfying, it is better than ODE 1.2 but still slightly worse than the "static" configuration patch (to be honest I don't know why it is still better). In conclusion the latest patch is ok, but there are plenty of places that code can be improved. Please start focusing on performance. For example there are still places where: (examples come from XPath20ExpressionRuntime) result = (List) someRes; __log.debug( "Returned list of size " + result.size()); or: Node contextNode = ctx.getRootNode() == null ? DOMUtils.newDocument() : ctx.getRootNode(); it can be replaced by: Node contextNode = ctx.getRootNode(); if (contextNode == null ){ contextNode = DOMUtils.newDocument(); } or static String generateMessageExchangeIdentifier( String partnerlinkName, String operationName) { StringBuffer sb = new StringBuffer (partnerlinkName); sb.append('.'); sb.append(operationName); return sb.toString(); } StringBuffer can be replaced by StringBuilder
          Hide
          Hudson added a comment -

          Integrated in ODE-trunk-jdk6 #382 (See https://hudson.apache.org/hudson/job/ODE-trunk-jdk6/382/)
          ODE-876: performance improvements, thanks to Mateusz Nowakowski!

          Show
          Hudson added a comment - Integrated in ODE-trunk-jdk6 #382 (See https://hudson.apache.org/hudson/job/ODE-trunk-jdk6/382/ ) ODE-876 : performance improvements, thanks to Mateusz Nowakowski!
          Hide
          Hudson added a comment -

          Integrated in ODE-trunk #393 (See https://hudson.apache.org/hudson/job/ODE-trunk/393/)
          ODE-876: performance improvements, thanks to Mateusz Nowakowski!

          Show
          Hudson added a comment - Integrated in ODE-trunk #393 (See https://hudson.apache.org/hudson/job/ODE-trunk/393/ ) ODE-876 : performance improvements, thanks to Mateusz Nowakowski!
          Hide
          Mateusz Nowakowski added a comment -

          We did more certified tests with ODE 1.3.4 + the latest patch and results are satisfying as well.

          When can we expect ODE 1.3.5.RC ?

          Show
          Mateusz Nowakowski added a comment - We did more certified tests with ODE 1.3.4 + the latest patch and results are satisfying as well. When can we expect ODE 1.3.5.RC ?
          Hide
          René Bos added a comment -

          Tammo, you were asking for patches that should be included in Ode 1.3.5. Please include https://issues.apache.org/jira/browse/ODE-787 because its blocking for PostgreSQL users.

          I agree with Mateusz., when can we expect a RC? We are setting up a new test and production environment and want to upgrade to a new Ode version as well.

          Thanks,
          René

          Show
          René Bos added a comment - Tammo, you were asking for patches that should be included in Ode 1.3.5. Please include https://issues.apache.org/jira/browse/ODE-787 because its blocking for PostgreSQL users. I agree with Mateusz., when can we expect a RC? We are setting up a new test and production environment and want to upgrade to a new Ode version as well. Thanks, René
          Hide
          Tammo van Lessen added a comment -

          Hi Mateusz, René,

          I'm almost done with the work on 1.3.5 - I still need some polishing and some checks on the build system and need to review ODE-901 and ODE-787 to see if it is possible to include them.

          I plan to cut a RC as soon as possible, will upload it so a nexus staging repository and will then share a link with you so that you can download it and run your tests. Your help in this process is much appreciated.

          Show
          Tammo van Lessen added a comment - Hi Mateusz, René, I'm almost done with the work on 1.3.5 - I still need some polishing and some checks on the build system and need to review ODE-901 and ODE-787 to see if it is possible to include them. I plan to cut a RC as soon as possible, will upload it so a nexus staging repository and will then share a link with you so that you can download it and run your tests. Your help in this process is much appreciated.

            People

            • Assignee:
              Tammo van Lessen
              Reporter:
              Mateusz Nowakowski
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development