Geronimo / GERONIMO-2577

Geronimo cluster (Tomcat version) cannot continue the HttpSession when the current node goes down.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1, 1.1.1
    • Fix Version/s: None
    • Component/s: Clustering, Tomcat
    • Security Level: public (Regular issues)
    • Labels:
      None
    • Environment:

      JDK - Sun java version "1_5_0_09"(32bit)
      OS- Red Hat Enterprise Linux ES4 update4(32bit)
      Http Server - Apache 2.0.59 +mod_jk 1.2.19

      Description

      We run a Geronimo cluster with three nodes.
      In our environment, with DeltaManager configured as the replication module, we found that the last node could not continue the session when the other nodes were intentionally halted one after another.

      Tomcat 5.5.15 works correctly with the same configuration and operations.

      Test Application
      ================
      The web application used for the test simply reads the access count and then writes the count to the HttpSession object.

      Configuration Files
      ==================
      We referred to http://cwiki.apache.org/GMOxDOC11/clustering-sample-application.html

      • config.xml
        We added the following parameters to the standard configuration.

      <gbean name="TomcatEngine">
      <attribute name="initParams">name=Geronimo
      jvmRoute=nodeA</attribute>
      </gbean>
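
      The fragment above is nodeA's override. The log excerpts later in this report show session IDs suffixed with nodeA, nodeB and nodeC, so each node carries its own jvmRoute. As a sketch, assuming nodeB's config.xml follows the same pattern (only the jvmRoute value differs; this is an assumption, not a copy of the attached file):

      ```xml
      <!-- nodeB's config.xml: same override as nodeA, only jvmRoute differs (assumed) -->
      <gbean name="TomcatEngine">
      <attribute name="initParams">name=Geronimo
      jvmRoute=nodeB</attribute>
      </gbean>
      ```

      With mod_jk sticky sessions, the worker name configured on the Apache side is expected to match the jvmRoute of the node it points to.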

      Operations
      ===============
      1 Access the test application from a browser and reload several times.(*1) The HttpSession object is created on nodeA.
      2 Kill the Geronimo process on nodeA with $kill -9 <Process ID>.(*2)
      3 Reload the browser once. The node changes to nodeB.(*3)
      4 Reload the browser several times.(*4)
      5 Kill the Geronimo process on nodeB with $kill -9 <Process ID>.(*5)
      6 Reload the browser once. (We expect the session to continue on nodeC.)
      But the session ID of the HttpSession object changes to another ID and the counter value goes back to 1.(*6)
      As a result, the Geronimo cluster cannot continue the session.

      Workaround
      ===============
      When replication module is SimpleTcpReplicationManager, the geronimo cluster can continue the process.

      Debug-level logs
      ==================
      (*1)
      nodeA
      ----------
      20:06:17,736 DEBUG [CoyoteAdapter] Requested cookie session id is 7160C8614D20687D3548E8488AB65AC3.nodeA
      20:06:17,736 DEBUG [JvmRouteBinderValve] Found Cluster DeltaManager org.apache.catalina.cluster.session.DeltaManager@2cb491 at /ClusterCheck
      20:06:17,736 DEBUG [JvmRouteBinderValve] Turnover Check time 0 msec
      20:06:17,737 DEBUG [MsgContext] COMMIT
      20:06:17,737 DEBUG [JkInputStream] COMMIT sending headers org.apache.coyote.Response@118994d === MimeHeaders ===

      20:06:17,737 DEBUG [MsgContext] CLOSE
      20:06:17,738 DEBUG [REQ_TIME] Time pre=0/ service=2 242 /ClusterCheck/counter
      20:06:17,738 DEBUG [ReplicationValve] Invoking replication request on /ClusterCheck/counter
      20:06:17,738 DEBUG [DeltaManager] Manager [/ClusterCheck]: create session message [7160C8614D20687D3548E8488AB65AC3.nodeA] delta request.
      20:06:17,757 DEBUG [McastService] Mcast receive ping from member org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.3:4001,catalina,192.168.1.3,4001, alive=58960]
      -----------

      nodeC
      -----------
      20:06:17,655 DEBUG [SimpleTcpCluster] Assuming clocks are synched: Replication for 7160C8614D20687D3548E8488AB65AC3.nodeA-1162811177738 took=-83 ms.
      20:06:17,655 DEBUG [DeltaManager] Manager [/ClusterCheck]: Received SessionMessage of type=(SESSION-DELTA) from [org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.1:4001,catalina,192.168.1.1,4001, alive=130441]]
      20:06:17,655 DEBUG [DeltaManager] Manager [/ClusterCheck]: received session [7160C8614D20687D3548E8488AB65AC3.nodeA] delta.
      -----------

      (*2)
      nodeB (same as nodeC)
      -----------
      20:06:39,817 INFO [SimpleTcpCluster] Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.1:4001,catalina,192.168.1.1,4001, alive=149288]
      20:06:39,818 DEBUG [MapperListener] Handle geronimo:type=IDataSender,senderAddress=192.168.1.1,senderPort=4001 type : JMX.mbean.unregistered
      20:06:39,818 DEBUG [MapperListener] Handle geronimo:type=IDataSender,senderAddress=192.168.1.1,senderPort=4001 type : JMX.mbean.unregistered
      20:06:39,818 DEBUG [MapperListener] Handle geronimo:type=IDataSender,senderAddress=192.168.1.1,senderPort=4001 type : JMX.mbean.unregistered
      20:06:39,818 DEBUG [DataSender] Sender close socket to [192.168.1.1:4,001] (close count 1)
      20:06:39,818 DEBUG [DataSender] Sender disconnect from [192.168.1.1:4,001] (disconnect count 1)
      ----------

      (*3)
      nodeB
      ---------------
      20:06:40,640 DEBUG [CoyoteAdapter] Requested cookie session id is 7160C8614D20687D3548E8488AB65AC3.nodeA
      20:06:40,641 DEBUG [JvmRouteBinderValve] Found Cluster DeltaManager org.apache.catalina.cluster.session.DeltaManager@16d383a at /ClusterCheck
      20:06:40,641 DEBUG [JvmRouteBinderValve] Detected a failover with different jvmRoute - orginal route: [nodeA] new one: [nodeB] at session id [7160C8614D20687D3548E8488AB65AC3.nodeA]
      20:06:40,641 DEBUG [JvmRouteBinderValve] Found Cluster DeltaManager org.apache.catalina.cluster.session.DeltaManager@16d383a at /ClusterCheck
      20:06:40,642 DEBUG [JvmRouteBinderValve] Setting cookie with session id [7160C8614D20687D3548E8488AB65AC3.nodeB] name: [JSESSIONID] path: [/ClusterCheck] secure: [false]
      20:06:40,643 DEBUG [JvmRouteBinderValve] Set Orginal Session id at request attriute org.apache.catalina.cluster.session.JvmRouteOrignalSessionID value: 7160C8614D20687D3548E8488AB65AC3.nodeA
      20:06:40,648 DEBUG [DataSender] Create sender [/192.168.1.3:4,001]
      20:06:40,650 DEBUG [DataSender] Sender open socket to [192.168.1.3:4,001] (open count 1)
      20:06:40,663 DEBUG [JvmRouteBinderValve] Changed session from [7160C8614D20687D3548E8488AB65AC3.nodeA] to [7160C8614D20687D3548E8488AB65AC3.nodeB]
      --------------

      nodeC
      --------------
      20:06:40,572 DEBUG [SimpleTcpCluster] Assuming clocks are synched: Replication for 7160C8614D20687D3548E8488AB65AC3.nodeA##localhost##/ClusterCheck##0##1162811200571 took=-72 ms.
      20:06:40,580 DEBUG [SimpleTcpCluster] Message org.apache.catalina.cluster.session.SessionIDMessage@128ccdf from type org.apache.catalina.cluster.session.SessionIDMessage transfered but no listener registered
      20:06:40,652 DEBUG [SimpleTcpCluster] Assuming clocks are synched: Replication for 7160C8614D20687D3548E8488AB65AC3.nodeB-1162811200691 took=-39 ms.
      20:06:40,652 DEBUG [DeltaManager] Manager [/ClusterCheck]: Received SessionMessage of type=(SESSION-DELTA) from [org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.2:4001,catalina,192.168.1.2,4001, alive=113760]]
      --------------

      (*4)
      nodeB
      ---------------
      20:06:43,677 DEBUG [CoyoteAdapter] Requested cookie session id is 7160C8614D20687D3548E8488AB65AC3.nodeB
      20:06:43,677 DEBUG [JvmRouteBinderValve] Found Cluster DeltaManager org.apache.catalina.cluster.session.DeltaManager@16d383a at /ClusterCheck
      20:06:43,677 DEBUG [JvmRouteBinderValve] Turnover Check time 0 msec
      20:06:43,678 DEBUG [MsgContext] COMMIT
      20:06:43,678 DEBUG [JkInputStream] COMMIT sending headers org.apache.coyote.Response@11e2b21 === MimeHeaders ===

      20:06:43,678 DEBUG [MsgContext] CLOSE
      20:06:43,678 DEBUG [REQ_TIME] Time pre=0/ service=1 242 /ClusterCheck/counter
      20:06:43,679 DEBUG [ReplicationValve] Invoking replication request on /ClusterCheck/counter
      20:06:43,679 DEBUG [DeltaManager] Manager [/ClusterCheck]: create session message [7160C8614D20687D3548E8488AB65AC3.nodeB] delta request.
      20:06:43,721 DEBUG [HandlerRequest] Invoke returned 0
      ---------------

      nodeC
      ---------------
      20:06:43,637 DEBUG [SimpleTcpCluster] Assuming clocks are synched: Replication for 7160C8614D20687D3548E8488AB65AC3.nodeB-1162811203679 took=-42 ms.
      20:06:43,637 DEBUG [DeltaManager] Manager [/ClusterCheck]: Received SessionMessage of type=(SESSION-DELTA) from [org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.2:4001,catalina,192.168.1.2,4001, alive=116748]]
      ---------------

      (*5)
      nodeC
      ---------------
      20:07:03,844 INFO [SimpleTcpCluster] Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.2:4001,catalina,192.168.1.2,4001, alive=133704]
      ---------------

      (*6)
      nodeC
      ---------------
      20:07:04,950 DEBUG [CoyoteAdapter] Requested cookie session id is 7160C8614D20687D3548E8488AB65AC3.nodeB
      20:07:04,950 DEBUG [StandardWrapper] Allocating non-STM instance
      20:07:04,976 DEBUG [DeltaManager] Created a DeltaSession with Id [E5F29B4F1C7F849618B4442076B84DB1.nodeC] Total count=2
      20:07:04,983 DEBUG [MsgContext] COMMIT
      20:07:04,985 DEBUG [JkInputStream] COMMIT sending headers org.apache.coyote.Response@1222045 === MimeHeaders ===
      Set-Cookie = JSESSIONID=E5F29B4F1C7F849618B4442076B84DB1.nodeC; Path=/ClusterCheck

      20:07:04,986 DEBUG [MsgContext] CLOSE
      ---------------

      1. appClustering_server.xml
        18 kB
        Shiva Kumar H R
      2. appClustering_context.xml
        2 kB
        Shiva Kumar H R
      3. geronimo-web-hostlevel.xml
        1 kB
        Shiva Kumar H R
      4. new-config.xml
        9 kB
        Shiva Kumar H R
      5. config.xml
        13 kB
        Jeff Genender
      6. geronimo-new.log
        17 kB
        Kaoru Matsumura
      7. geronimo-web-new.xml
        4 kB
        Jeff Genender
      8. context.xml
        0.3 kB
        Kaoru Matsumura
      9. server-nodeC.xml
        22 kB
        Kaoru Matsumura
      10. server-nodeB.xml
        22 kB
        Kaoru Matsumura
      11. server-nodeA.xml
        22 kB
        Kaoru Matsumura
      12. geronimo-web-nodeC.xml
        4 kB
        Kaoru Matsumura
      13. geronimo-web-nodeB.xml
        4 kB
        Kaoru Matsumura
      14. geronimo-web.xml
        4 kB
        Kaoru Matsumura

        Activity

        Kaoru Matsumura added a comment -

        Here is the geronimo-web.xml used for this test.

        Dave Colasurdo added a comment -

        1) Is node c (the failing node) always the same physical machine?

        2) Are you 100% certain that all three nodes are on the same subnet? This is extremely important.
        Applying the subnet mask to each node's IP address should yield the exact same network portion of the IP address.

        3) Have you added the <distributable> tag to the web.xml for each clustered app?

        4) Can you please elaborate on the following statement ..."When replication module is SimpleTcpReplicationManager, the geronimo cluster can continue the process." What exactly does that mean? What replication module are you using?

        Thanks
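
        For reference on question 3: the <distributable> tag is an element of the standard web.xml deployment descriptor. A minimal sketch of where it sits, assuming a Servlet 2.4 descriptor (the display-name value is a placeholder, not taken from the reporter's application):

        ```xml
        <!-- WEB-INF/web.xml (sketch; display-name is a placeholder) -->
        <web-app xmlns="http://java.sun.com/xml/ns/j2ee" version="2.4">
          <display-name>ClusterCheck</display-name>
          <!-- Marks the web application as suitable for session replication -->
          <distributable/>
        </web-app>
        ```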

        Dave Colasurdo added a comment -

        I have seen clustering fail in multicast mode when the physical machines were not on the same subnet...
        Even when you plug machines into adjacent wallports, oftentimes the subnets are different. That would be my first guess..

        Lots more questions

        5) I've heard that this same scenario works for you when using Tomcat w/o Geronimo. Did you run the Tomcat test
        on the exact same three machines as the Geronimo test?
        6) Are you certain that config.xml has a unique node name (i.e. jvmRoute) for each of the three nodes?
        7) Are you certain that the IP addresses are unique (and correct) in the deployment plan for each node?
        8) Can you provide all three deployment plans (one for each node)?
        9) Any particular reason you set useDirtyFlag=true?
        10) I see that you removed the mcastBindAddress setting in the deployment plan. I've heard differing
        information on whether or not this is needed. Have you tried setting this field in each of the nodes?
        11) Have you tried removing waitForAck=true and using ackTimeout for the test?

        Thanks
        Dave

        Kaoru Matsumura added a comment -

        Thank you for your reply.

        1) No. This problem doesn't depend on nodes.
        2) Yes.
        3) Yes.
        4) This means that when we changed DeltaManager to SimpleTcpReplicationManager,
        the session could continue on nodeC (the last node).
        So we avoid this problem by using SimpleTcpReplicationManager.

        Line 21 of geronimo-web.xml
        ---------------------------
        managerClassName=org.apache.catalina.cluster.session.DeltaManager
        ↓
        managerClassName=org.apache.catalina.cluster.session.SimpleTcpReplicationManager
        ---------------------------

        With Tomcat, this problem does not happen even when using DeltaManager.

        5) Yes.
        6) Yes.
        7) Yes.
        8) OK.
        9) The results were the same whether useDirtyFlag was true or false.
        So we set useDirtyFlag to true.
        10) When mcastBindAddress is set on Linux, it seems that multicast packets cannot be received.
        So I removed mcastBindAddress from all of the nodes.
        This behavior is the same on Tomcat and Geronimo.
        11) No, I hadn't, so I tried it. But the results were the same.

        One thing caught my attention.

        At the point of (*1), nodeC's log in the Description of this problem report shows,

        20:06:17,655 DEBUG [DeltaManager] Manager [/ClusterCheck]: Received SessionMessage of type=(SESSION-DELTA) from [org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.1:4001,catalina,192.168.1.1,4001, alive=130441]]
        20:06:17,655 DEBUG [DeltaManager] Manager [/ClusterCheck]: received session [7160C8614D20687D3548E8488AB65AC3.nodeA] delta.

        However, at the point of (*3), nodeC's log shows,

        20:06:40,652 DEBUG [DeltaManager] Manager [/ClusterCheck]: Received SessionMessage of type=(SESSION-DELTA) from [org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.1.2:4001,catalina,192.168.1.2,4001, alive=113760]]

        These are different. In the second case, the 'received session' line is missing, and it was not printed in the log from this point on.
        I think this is related.

        thanks

        Kaoru Matsumura added a comment -

        Here is the geronimo-web.xml used on nodeB.

        Kaoru Matsumura added a comment -

        Here is the geronimo-web.xml used on nodeC.

        Kevan Miller added a comment -

        Kaoru,
        Thanks for all the great info. Looking at the logs from your jira, it looks like Tomcat is not being properly configured for your scenario.

        From (*3) NodeC:

        20:06:40,580 DEBUG [SimpleTcpCluster] Message org.apache.catalina.cluster.session.SessionIDMessage@128ccdf from type org.apache.catalina.cluster.session.SessionIDMessage transfered but no listener registered

        This indicates a clusterListener is not being configured to Tomcat.

        I'm not terribly familiar with Tomcat clustering. However, it seems that the clustering example at http://cwiki.apache.org/GMOxDOC11/clustering-sample-application.html may work for a two-node cluster, but not a three(or more)-node cluster.

        An org.apache.catalina.cluster.session.JvmRouteSessionIDBinderListener needs to be configured as a clusterListener to org.apache.catalina.cluster.tcp.SimpleTcpCluster.

        I see the following dev list post describes the necessary configuration of the clusterListener – http://cwiki.apache.org/GMOxDOC11/clustering-sample-application.html. However, I'm not sure that it's up-to-date. Perhaps somebody more familiar with Geronimo clustering configuration can comment...

        Jeff Genender added a comment -

        I think this is the problem:

        I noticed the ClusterSessionListener and JvmRouteSessionIDBinderListener Gbeans are declared, but the listener is not being plugged into anything...i.e. there is nothing referencing these GBeans. Therefore, they are being ignored.

        You probably need the following:

        Add the following reference to your TomcatCluster GBean:

        <reference name="messageListenerChain"> <name>ClusterSessionListener</name> </reference>

        Then you need to add a reference to the JvmRouteSessionIDBinderListener in the ClusterSessionListener. So add the following to the ClusterSessionListener Gbean:

        <reference name="nextListener"><name>JvmRouteSessionIDBinderListener</name></reference>

        This should get the clustering to work.

        If it does not, could you please also post your Tomcat configuration files (server.xml and context.xml) so I can juxtapose them and see if anything is missing?

        Kaoru Matsumura added a comment -

        Thank you for your reply.

        I added the following line at line 31 of my geronimo-web.xml.
        <reference name="MessageListenerChain"> <name>ClusterSessionListener</name> </reference>

        I also added the following at line 93.
        <reference name="nextListener"><name>JvmRouteSessionIDBinderListener</name></reference>

        But the results were the same.

        So I am posting my Tomcat 5.5.15 configuration files (server.xml and context.xml).

        context.xml is unchanged from the default on each node.

        Thanks

        Kaoru Matsumura added a comment -

        Here is the Tomcat 5.5.15 server.xml used on nodeA.

        Kaoru Matsumura added a comment -

        Here is the Tomcat 5.5.15 server.xml used on nodeB.

        Kaoru Matsumura added a comment -

        Here is the Tomcat 5.5.15 server.xml used on nodeC.

        Kaoru Matsumura added a comment -

        Here is the Tomcat 5.5.15 context.xml used on every node.

        Jeff Genender added a comment -

        Kaoru,

        My fault on the config...please use "NextListener" instead of "nextListener". It is case sensitive.
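
        Putting the two references from this thread together with the corrected casing, the wiring would look roughly like the sketch below. This is an illustration only: the GBean names come from the log messages and comments in this issue, the `MessageListenerChain` spelling follows Kaoru's comment, and the class attributes and all other existing attributes are omitted, so it is not a complete deployment plan.

        ```xml
        <!-- Sketch: only the listener-chain references are shown -->
        <gbean name="TomcatCluster">
          <!-- existing attributes and references unchanged -->
          <reference name="MessageListenerChain">
            <name>ClusterSessionListener</name>
          </reference>
        </gbean>

        <gbean name="ClusterSessionListener">
          <!-- existing attributes unchanged -->
          <reference name="NextListener">
            <name>JvmRouteSessionIDBinderListener</name>
          </reference>
        </gbean>

        <gbean name="JvmRouteSessionIDBinderListener">
          <!-- existing attributes unchanged; last listener in the chain -->
        </gbean>
        ```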

        Jeff Genender added a comment -

        The only other difference I am seeing here is that you are applying clustering at the host level in Tomcat and at the context (web app) level in Geronimo. Let's see if the "NextListener" works...and if not we can look at making the clustering the same levels on G and TC.

        Kaoru Matsumura added a comment -

        Thanks, Jeff

        I checked, and in our environment
        "NextListener" is already being used, not "nextListener".

        Thanks

        Kaoru Matsumura added a comment -

        Jeff,

        I found the following log message regarding NextListener during startup of the Geronimo cluster.
        Is the message "no targets are running for reference NextListener matching ...." expected?

        nodeA
        ---------
        2006-12-25-11-10-29-691 DEBUG [GBeanSingleReference] Waiting to start geronimo/ClusterCheck/1.1/war?J2EEApplication=null,WebModule=geronimo/ClusterCheck/1.1/war,j2eeType=MessageListener,name=ClusterSessionListener because no targets are running for reference NextListener matching the patterns geronimo/ClusterCheck/1.1/war?J2EEApplication=null,WebModule=geronimo/ClusterCheck/1.1/war,j2eeType=MessageListener,name=JvmRouteSessionIDBinderListener
        ---------

        The same message appears on nodeB and nodeC.

        "ClusterCheck" is our test application.

        Jeff Genender added a comment -

        I am able to get the listeners running fine for the geronimo-web-new.xml file that I have attached. You will need to edit line 54 with your IP address for each machine.

        Please notice the log below.


        07:10:18,727 DEBUG [GBeanInstanceState] GBeanInstanceState for: geronimo/ClusterCheck/1.1/war?J2EEApplication=null,WebModule=geronimo/ClusterCheck/1.1/war,j2eeType=MessageListener,name=JvmRouteSessionIDBinderListener State changed from stopped to starting
        07:10:18,735 DEBUG [MessageListenerGBean] org.apache.catalina.cluster.session.JvmRouteSessionIDBinderListener started.
        07:10:18,735 DEBUG [GBeanInstanceState] GBeanInstanceState for: geronimo/ClusterCheck/1.1/war?J2EEApplication=null,WebModule=geronimo/ClusterCheck/1.1/war,j2eeType=MessageListener,name=JvmRouteSessionIDBinderListener State changed from starting to running
        07:10:18,747 DEBUG [GBeanSingleReference] Started geronimo/ClusterCheck/1.1/war?J2EEApplication=null,WebModule=geronimo/ClusterCheck/1.1/war,j2eeType=MessageListener,name=ClusterSessionListener
        07:10:18,753 DEBUG [MessageListenerGBean] org.apache.catalina.cluster.session.ClusterSessionListener started.
        07:10:18,753 DEBUG [GBeanInstanceState] GBeanInstanceState for: geronimo/ClusterCheck/1.1/war?J2EEApplication=null,WebModule=geronimo/ClusterCheck/1.1/war,j2eeType=MessageListener,name=ClusterSessionListener State changed from starting to running
        07:10:18,754 DEBUG [GBeanSingleReference] Started geronimo/ClusterCheck/1.1/war?J2EEApplication=null,WebModule=geronimo/ClusterCheck/1.1/war,j2eeType=Cluster,name=TomcatCluster


        Please give this a try.

        Kaoru Matsumura added a comment -

        Thank you for your advice.

        I tried your geronimo-web-new.xml.
        I observed the same log messages that you pointed out.
        But the results of the test were the same.

        Kaoru Matsumura added a comment -

        Additional information:

        I have attached the whole geronimo.log (except the [McastService] ping entries) of the last remaining node,
        recorded while using your geronimo-web-new.xml.

        15:20:40 The first node went down.
        15:22:50 The second node went down.
        15:23:07 A new session ID was created.
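
        A log trimmed in this way could be produced with a small filter. The following is only a sketch: it assumes that every line tagged "[McastService]" is membership ping noise, and the function name and the file names in the usage comment are hypothetical.

```python
from typing import Iterable, Iterator


def drop_mcast_pings(lines: Iterable[str],
                     marker: str = "[McastService]") -> Iterator[str]:
    """Yield only the log lines that do not contain the given marker.

    Assumption: all lines carrying the marker are multicast ping noise.
    Tighten the condition if other McastService messages matter.
    """
    for line in lines:
        if marker not in line:
            yield line


# Hypothetical usage (file names are placeholders):
#   with open("geronimo.log") as src, open("geronimo-trimmed.log", "w") as dst:
#       dst.writelines(drop_mcast_pings(src))
```

        Because the filter works line by line, it can be applied to a large log without loading the whole file into memory.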

        Jeff Genender added a comment -

        OK, let's first make this an apples-to-apples comparison. Tomcat is being configured at the Host level, while you are configuring Geronimo at the Context level. I have attached a config.xml with the changes required to turn on Host level clustering. Please change line 162 to match your IP address and line 71 to reflect your node. Also, please remove all clustering configuration from your geronimo-web.xml, since we will now be clustering at the Host level.

        Kaoru Matsumura added a comment -

        OK. I'll try it.

        Thanks

        Shiva Kumar H R added a comment -

        Hello Kaoru,
        I started looking into this and could reproduce the problem even on a cluster consisting of 3 Windows XP machines. The problem occurs when Context level clustering is used (with both "geronimo-web.xml" and "geronimo-web-new.xml").

        I then tried Host level clustering using the "config.xml" posted by Jeff. Since that "config.xml" has many settings that are not in the default setup, I edited a fresh "config.xml" from a "geronimo-tomcat-1.1.1" setup and included Jeff's additions. The resulting "new-config.xml" is attached to this JIRA. In addition, "geronimo-web-hostlevel.xml" is used on all 3 nodes.

        This time clustering works superbly. The session is preserved successfully, even onto the 3rd (last remaining) node.

        Here are the steps I tried:
        1) Start all 3 servers: Server-1, Server-2 & Server-3.
        2) Deploy the application on all three servers and access it from the client. The client was being served by Server-2.
        3) Kill Server-2 and refresh the client. The client is now served by Server-3 with session id & state preserved.
        4) Kill Server-3 and refresh the client. The client is now served by Server-1 with session id & state preserved.
        I also tried these:
        5) Restart Server-2 and kill Server-1. The client is now served by Server-2 with session id & state preserved.
        6) Restart Server-3 and kill Server-2. The client is now served by Server-3 with session id & state preserved.

        So failover works successfully across the 3-node cluster (when Host level clustering is used).
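
        The "session id & state preserved" check in the steps above can be expressed as a small helper. This is a minimal sketch under stated assumptions: the session id may carry a ".route" suffix (the jvmRoute) that is allowed to change on failover, and the session state is a counter that must keep increasing. The Observation type and both function names are hypothetical, and the values have to be collected by hand or by a separate HTTP client.

```python
from typing import List, NamedTuple, Tuple


class Observation(NamedTuple):
    served_by: str   # node that answered the request (label chosen by the tester)
    session_id: str  # JSESSIONID value, possibly with a ".route" suffix
    counter: int     # state stored in the session (assumed to be an access counter)


def split_route(session_id: str) -> Tuple[str, str]:
    """Split 'ID.route' into (ID, route); route is '' when there is no suffix."""
    base, _, route = session_id.partition(".")
    return base, route


def session_survived(observations: List[Observation]) -> bool:
    """Return True if every request saw the same base session id and a growing counter."""
    if not observations:
        return False
    first_id, _ = split_route(observations[0].session_id)
    previous = observations[0].counter
    for obs in observations[1:]:
        base, _ = split_route(obs.session_id)
        # A new base id or a counter that did not grow means the session was lost.
        if base != first_id or obs.counter <= previous:
            return False
        previous = obs.counter
    return True
```

        Comparing only the part before the dot is a design choice: it keeps the check valid whether or not the route suffix is rewritten when another node takes over.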

        Please try this on your RHEL Cluster using "new-config.xml" and "geronimo-web-hostlevel.xml".

        Note that the attached "new-config.xml" is for a "geronimo-tomcat-1.1.1" setup. Line 162 needs to be edited with the IP address of each machine. Since I am using the sample application from http://cwiki.apache.org/GMOxDOC11/clustering-sample-application.html, which has security enabled, the posted "geronimo-web-hostlevel.xml" contains security configuration. Comment out or delete lines 15 to 25 if your application doesn't need them.

        Thx,
        Shiva Kumar

        Shiva Kumar H R added a comment -

        Also edit line 70 of "new-config.xml"

        Thx,
        Shiva

        Kaoru Matsumura added a comment -

        Hello Jeff and Shiva, I'm sorry for the late reply.

        I tried your new-config.xml and geronimo-web-hostlevel.xml (Host level clustering) in my RHEL environment.
        Good! The clustering works perfectly. As in your case, the session is preserved successfully, even onto the 3rd remaining node.

        Would you please tell me why clustering doesn't work with Context level clustering + DeltaManager?

        Thanks

        Shiva Kumar H R added a comment -

        Kaoru,
        Thanks for testing this.

        It looks like Context level clustering is either a not-yet-supported feature or a bug in Tomcat itself. I tested Context/Application level clustering in Tomcat 5.5.20 on my 3-node Win-XP cluster, and the test fails.

        I used the following configuration files for enabling Context level clustering in Tomcat:
        1) "appClustering_context.xml" (edit lines 12 and 25 of this file in case you want to give it a try; copy this file into Tomcat's "conf" folder and rename it as "context.xml")
        2) "appClustering_server.xml" (copy this file to Tomcat's "conf" folder and rename it as "server.xml")

        After deploying the sample application from http://cwiki.apache.org/GMOxDOC11/clustering-sample-application.html and running http://Yourhost/servlets-examples-cluster/servlet/SessionExample, I observe that the session id changes on page refreshes and when attributes are added. I tried this application level clustering in Tomcat multiple times, with the same results every time; even sticky sessions were not in effect.

        Jeff,
        How can we confirm whether this is a bug or an unsupported feature in Tomcat?

        If it turns out to be an unsupported feature in Tomcat, shouldn't we state that "Geronimo supports Clustering at the Host or Engine level only" and update the tutorial at http://cwiki.apache.org/GMOxDOC11/clustering-sample-application.html accordingly? Please suggest.

        Thx,
        Shiva

        Shiva Kumar H R added a comment -

        Opened a bug in Tomcat Bugzilla for context level clustering problem:
        "Context level clustering on 3 or more nodes fails in Tomcat 5.5.20 http://issues.apache.org/bugzilla/show_bug.cgi?id=41620"

        Created a separate article on "Geronimo Tomcat Host Level Clustering" http://cwiki.apache.org/GMOxDOC11/geronimo-tomcat-context-level-clustering-sample-application.html

        And updated the current clustering article http://cwiki.apache.org/GMOxDOC11/clustering-sample-application.html with the following warning:
        "Context level" clustering has a known bug in Tomcat as reported in https://issues.apache.org/jira/browse/GERONIMO-2577 and http://issues.apache.org/bugzilla/show_bug.cgi?id=41620. Hence we recommend that you use "Host level" clustering as documented in http://cwiki.apache.org/GMOxDOC11/geronimo-tomcat-host-level-clustering-sample-application.html.

        Shiva

        Shiva Kumar H R added a comment -

        Tomcat bug http://issues.apache.org/bugzilla/show_bug.cgi?id=41620 closed as "Resolved Invalid" with the following comments:

        ------- Additional Comment #7 From Mark Thomas 2007-02-15 18:54 [reply] -------
        It is not possible to configure clustering in context.xml. It must be done at the Host level (with the jvmRoute defined at the Engine level) within server.xml
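
        The placement rule in the quoted comment can be checked mechanically. The following is a rough sketch, not a Tomcat tool: it assumes the conventional Server/Service/Engine/Host nesting of server.xml, it only inspects element placement and the jvmRoute attribute, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def check_cluster_placement(server_xml: str) -> List[str]:
    """Return a list of placement problems found in the given server.xml text."""
    root = ET.fromstring(server_xml)
    problems: List[str] = []

    engines = list(root.iter("Engine"))
    if not engines:
        problems.append("no <Engine> element found")

    for engine in engines:
        # The jvmRoute is expected on the Engine element.
        if not engine.get("jvmRoute"):
            problems.append("Engine %r has no jvmRoute" % engine.get("name"))
        # Each Host is expected to carry its own <Cluster> child.
        for host in engine.findall("Host"):
            if host.find("Cluster") is None:
                problems.append("Host %r has no <Cluster> child" % host.get("name"))

    # A <Cluster> nested directly in a <Context> is the setup reported as failing.
    for context in root.iter("Context"):
        if context.find("Cluster") is not None:
            problems.append("<Cluster> found inside <Context>")

    return problems
```

        An empty list means the file follows the rule as stated in the comment; it does not prove that replication itself is configured correctly.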

        No discussing on AG dev list as to whether we should claim that "Geronimo (Tomcat version) supports Clustering at the Host or Engine level only".

        Shiva Kumar H R added a comment -

        Read the last line in above comment as "Now discussing on AG dev list ..."


          People

          • Assignee:
            Unassigned
            Reporter:
            Kaoru Matsumura
          • Votes:
            0
            Watchers:
            1
