WICKET-6356

Clustering failover not working on Tomcat

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.17.0, 7.5.0
    • Fix Version/s: 6.27.0, 7.7.0, 8.0.0-M6
    • Component/s: wicket
    • Environment:
      Tomcat 8.0.32, Apache 2.4, OS X and Solaris 9

      Description

      Clustering failover with Tomcat 8 is not working for us in production or locally on my Mac. From the debugging that I did, and to the best of my understanding, the reason is the following:

      Whenever a stateful page is used/touched, the SessionEntry is updated. This SessionEntry is stored in Tomcat's session under the attribute name "wicket:persistentPageManagerData-APPLICATION_NAME".
      However, at the end of the Wicket request, Session.internalDetach() is called, and it calls setAttribute() with the Wicket session under the name "wicket:wicket.hub:session". This triggers replication in Tomcat (replication happens only when setAttribute() is called, and ONLY that attribute is replicated). Because the SessionEntry is stored under a different attribute name, it is never replicated.
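
      The behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use Tomcat or Wicket: ReplicatedSession, getReplicaAttribute() and the "pages" attribute name are hypothetical stand-ins, and the only assumption modelled is the one stated in this report, namely that an attribute is copied to the other node only when setAttribute() is called for it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

public class DeltaReplicationSketch {

    /** Hypothetical stand-in for a clustered session: the replica is updated only inside setAttribute(). */
    static class ReplicatedSession {
        private final Map<String, Object> local = new HashMap<>();
        private final Map<String, Object> replica = new HashMap<>();

        void setAttribute(String name, Serializable value) {
            local.put(name, value);
            // The serialized copy stands in for what would be sent to the other node.
            replica.put(name, deepCopy(value));
        }

        Object getAttribute(String name) {
            return local.get(name);
        }

        Object getReplicaAttribute(String name) {
            return replica.get(name);
        }
    }

    /** Copies a value through Java serialization, as a replication message would. */
    static Object deepCopy(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ReplicatedSession session = new ReplicatedSession();
        ArrayList<String> pages = new ArrayList<>();
        pages.add("page-1");
        session.setAttribute("pages", pages);

        // In-place change without setAttribute(): the replica does not see it.
        pages.add("page-2");
        System.out.println(session.getReplicaAttribute("pages")); // [page-1]

        // Setting the same attribute again pushes the current state to the replica.
        session.setAttribute("pages", pages);
        System.out.println(session.getReplicaAttribute("pages")); // [page-1, page-2]
    }
}
```

      In this sketch the replica stays at [page-1] until the attribute is set again, which mirrors the situation where the SessionEntry is modified but never passed to setAttribute().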

      The only time the SessionEntry is replicated is when a node first starts up and joins the cluster. At this point all sessions (including all attributes) are replicated across by the DeltaManager, and SessionEntry#readObject() is used, which contains all the pages. On Tomcat, after this initial sync of all sessions, SessionEntry#readObject() is never used.

      So the only time session failover works is when you kill the old instance (in a 2-node cluster) just after the new instance starts up; at this point the correct SessionEntry is on the new instance. If, however, you visit some more pages, the new pages are never replicated across, because the SessionEntry is never replicated.

      Furthermore, even IF the SessionEntry were replicated, it would not help, because the caches are transient:
      private transient List<IManageablePage> sessionCache;
      private transient List<Object> afterReadObject;
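
      As a reminder of what transient means under default Java serialization, here is a small sketch. The Entry class and its fields are hypothetical and only loosely named after the fields above; the real SessionEntry has its own readObject() handling, which this sketch does not model.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TransientFieldSketch {

    /** Hypothetical entry with one regular field and one transient cache. */
    static class Entry implements Serializable {
        private static final long serialVersionUID = 1L;

        final String sessionId;
        // Skipped by default serialization; the field initializer is not run on deserialization.
        transient List<String> sessionCache = new ArrayList<>();

        Entry(String sessionId) {
            this.sessionId = sessionId;
        }
    }

    /** Serializes and deserializes the entry, as a replication or passivation step would. */
    static Entry roundTrip(Entry entry) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(entry);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Entry) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Entry entry = new Entry("abc");
        entry.sessionCache.add("page-1");

        Entry copy = roundTrip(entry);
        System.out.println(copy.sessionId);    // abc
        System.out.println(copy.sessionCache); // null
    }
}
```

      The non-transient field survives the round trip, while the transient cache comes back as null unless a custom readObject() restores it.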

      This means any new pages created on the old instance (after the new instance has started up) are not available in the HTTP session or the second-level page store on the new instance.
      Therefore, when the old instance is shut down, the user is load balanced to the new instance. At this point the page containing the link Wicket is looking for does not exist in the SessionEntry cache or the PageStore, so it creates the page and looks for the component/link. This causes a ComponentNotFoundException for us, because the links are either in a DataView that is never rendered and so do not exist, or are added to the page in an Ajax request and, again because the page is not rendered, are not there. Wicket then throws the exception, and it appears to the user that the session is lost.

      So, in summary, there seems to be no way for the current mechanism to work on Tomcat. It would seem that SessionEntry.sessionCache needs to be non-transient, and that setAttribute() needs to be called for the SessionEntry at the end of the request in internalDetach(), so that Tomcat's DeltaManager replicates that attribute in the session.

      Attached are my quickstart, Tomcat and Apache conf.

      1. PageStoreManager.java
        10 kB
        wayne pope
      2. IssueFiles.zip
        34 kB
        wayne pope

          Activity

          bitstorm Andrea Del Bene added a comment -

          Hi all,

          if there are no urgent issues to close, I can start the release vote for 7.8.0 in the next few days.

          waynegc wayne pope added a comment -

          Thanks Martin, much appreciated for pointing that out. I think it's best that I wait until the release of 7.8.0, then. Any idea/guess when that will go live?

          mgrigorov Martin Grigorov added a comment -

          Great news!
          Just make sure you also use the fix for WICKET-6387!

          waynegc wayne pope added a comment -

          Hello Martin,

          Sincere apologies for the amount of time it's taken me to circle back to this, but we had some major releases that needed my attention. I'm very glad to say that your fixes have solved the replication issue, at least in testing. I will get this moved to production, but the outcome looks great. Many, many thanks for working on this.

          mgrigorov Martin Grigorov added a comment -

          I was able to reproduce the problem.
          I've just added the following Apache2 conf to /etc/apache2/sites-enabled/tomcat-cluster.conf:

          ProxyPass "/" "balancer://cluster/" stickysession=JSESSIONID nofailover=Off
          ProxyPassReverse "/" "balancer://cluster/"
          
          # define the balancer, with http and/or ajp connections
          <Proxy balancer://cluster/>
              Order allow,deny
              Allow from all
              BalancerMember ajp://127.0.0.1:8009 route=gc1
              BalancerMember ajp://127.0.0.1:8010 route=gc2
          </Proxy>

          The needed modules are: proxy_ajp.load, proxy_balancer.conf, proxy_balancer.load, proxy.conf, proxy_http.load, proxy.load, lbmethod_byrequests.load

          The version of Tomcat is 8.0.43. For 8.5.x, MessageDispatch15Interceptor should be replaced with MessageDispatchInterceptor in server.xml.
          Also, I've changed the receiver address to 127.0.0.1.

          wayne pope, please test my change with 7.7.0-SNAPSHOT and let me know if anything is still broken!
          Thank you very much for the discussion and analysis!

          jira-bot ASF subversion and git services added a comment -

          Commit 290398a3e4f950f6c6cd93ad0c557c12c867ce0e in wicket's branch refs/heads/master from Martin Grigorov
          [ https://git-wip-us.apache.org/repos/asf?p=wicket.git;h=290398a ]

          WICKET-6356 Clustering failover not working on Tomcat

          Touch the session attribute every time a new stateful pages are stored so that session replication is triggered

          jira-bot ASF subversion and git services added a comment -

          Commit 77bafbc7de94a4f5a5ce7ed054c6bc5bb693f266 in wicket's branch refs/heads/wicket-6.x from Martin Grigorov
          [ https://git-wip-us.apache.org/repos/asf?p=wicket.git;h=77bafbc ]

          WICKET-6356 Clustering failover not working on Tomcat

          Touch the session attribute every time a new stateful pages are stored so that session replication is triggered

          jira-bot ASF subversion and git services added a comment -

          Commit 9338bdffedd1263a48c55a1f3e2c78f11c9d9d6b in wicket's branch refs/heads/wicket-7.x from Martin Grigorov
          [ https://git-wip-us.apache.org/repos/asf?p=wicket.git;h=9338bdf ]

          WICKET-6356 Clustering failover not working on Tomcat

          Touch the session attribute every time a new stateful pages are stored so that session replication is triggered

          mgrigorov Martin Grigorov added a comment -

          I have two questions regarding the attached patch:

          1) Why are the #removePage() methods removed?

          2) Why is "afterReadObject" no longer transient?

          mgrigorov Martin Grigorov added a comment -

          Thanks, wayne pope!
          I'll take a look in the coming days!

          waynegc wayne pope added a comment -

          I've done a fix which seems to work for me. Please find attached an updated PageStoreManager.java, based on the 7.5.0 code.


            People

            • Assignee:
              mgrigorov Martin Grigorov
              Reporter:
              waynegc wayne pope