Bug 34260 - Bring up a dead node, the node will not get session data update from others.
Status: RESOLVED INVALID
Alias: None
Product: Tomcat 5
Classification: Unclassified
Component: Catalina:Cluster
Version: 5.0.28
Hardware: Other Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-04-01 03:04 UTC by Hang Zhao
Modified: 2005-04-01 11:37 UTC
0 users



Description Hang Zhao 2005-04-01 03:04:05 UTC
A Tomcat cluster, configured as:
---------------------------------------------------------------------------
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
                managerClassName="org.apache.catalina.cluster.session.DeltaManager"
                expireSessionsOnShutdown="false"
                useDirtyFlag="true">
-------------------------------------------------------------------------
Kill one node and bring it back up again: it does not receive session data
updates from the other node. This is what I see after bringing the dead node
back up:

----------------------------------------------------------------------------
Created MBeanServer with ID: 18020cc:102f5dc8465:-8000:donau:1
Mar 30, 2005 3:47:13 PM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-8080
Mar 30, 2005 3:47:13 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 1273 ms
Mar 30, 2005 3:47:13 PM org.apache.catalina.core.StandardService start
INFO: Starting service Catalina
Mar 30, 2005 3:47:13 PM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.0
Mar 30, 2005 3:47:13 PM org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
Mar 30, 2005 3:47:13 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster start
INFO: Cluster is about to start
Mar 30, 2005 3:47:13 PM org.apache.catalina.cluster.mcast.McastService start
INFO: Sleeping for 2000 secs to establish cluster membership
Mar 30, 2005 3:47:14 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster memberAdded
INFO: Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://xxx.xx.20.218:4002,xxx.xx.20.218,4002, alive=10907852]
Mar 30, 2005 3:47:14 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster memberAdded
INFO: Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://127.0.0.1:4002,127.0.0.1,4002, alive=92927450]
Mar 30, 2005 3:47:16 PM org.apache.catalina.core.StandardHost getDeployer
INFO: Create Host deployer for direct deployment ( non-jmx )
Mar 30, 2005 3:47:16 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Processing Context configuration file URL
file:/opt/dev/share/jakarta/tomcat/base/conf/Catalina/localhost/balancer.xml
Mar 30, 2005 3:47:16 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Processing Context configuration file URL
file:/opt/dev/share/jakarta/tomcat/base/conf/Catalina/localhost/manager.xml
Mar 30, 2005 3:47:17 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Processing Context configuration file URL
file:/opt/dev/share/jakarta/tomcat/base/conf/Catalina/localhost/admin.xml
Mar 30, 2005 3:47:17 PM org.apache.struts.util.PropertyMessageResources <init>
INFO: Initializing, config='org.apache.struts.util.LocalStrings', returnNull=true
Mar 30, 2005 3:47:17 PM org.apache.struts.util.PropertyMessageResources <init>
INFO: Initializing, config='org.apache.struts.action.ActionResources',
returnNull=true
Mar 30, 2005 3:47:17 PM org.apache.struts.util.PropertyMessageResources <init>
INFO: Initializing, config='org.apache.webapp.admin.ApplicationResources',
returnNull=true
Mar 30, 2005 3:47:20 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Installing web application at context path /tomcat-docs from URL
file:/opt/dev/share/jakarta/tomcat/base/webapps/tomcat-docs
Mar 30, 2005 3:47:20 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Installing web application at context path /jsp-examples from URL
file:/opt/dev/share/jakarta/tomcat/base/webapps/jsp-examples
Mar 30, 2005 3:47:20 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Installing web application at context path  from URL
file:/opt/dev/share/jakarta/tomcat/base/webapps/ROOT
Mar 30, 2005 3:47:20 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Installing web application at context path /webdav from URL
file:/opt/dev/share/jakarta/tomcat/base/webapps/webdav
Mar 30, 2005 3:47:20 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Installing web application at context path /clusterapp from URL
file:/opt/dev/share/jakarta/tomcat/base/webapps/clusterapp




Creating ClusterManager for context /clusterapp using class
org.apache.catalina.cluster.session.DeltaManager




Mar 30, 2005 3:47:20 PM org.apache.catalina.cluster.session.DeltaManager start
INFO: Starting clustering manager...:/clusterapp
Mar 30, 2005 3:47:20 PM org.apache.catalina.cluster.tcp.ReplicationTransmitter sendMessageData
WARNING: Unable to send replicated message, is server down?
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:171)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:158)
        at java.net.Socket.connect(Socket.java:452)
        at java.net.Socket.connect(Socket.java:402)
        at java.net.Socket.<init>(Socket.java:309)
        at java.net.Socket.<init>(Socket.java:153)
        at org.apache.catalina.cluster.tcp.SocketSender.connect(SocketSender.java:66)
        at org.apache.catalina.cluster.tcp.SocketSender.sendMessage(SocketSender.java:112)
        at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:119)
        at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:117)
        at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:136)
        at org.apache.catalina.cluster.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:457)
        at org.apache.catalina.cluster.session.DeltaManager.start(DeltaManager.java:648)
        at org.apache.catalina.core.ContainerBase.setManager(ContainerBase.java:499)
        at org.apache.catalina.startup.ContextConfig.managerConfig(ContextConfig.java:308)
        at org.apache.catalina.startup.ContextConfig.start(ContextConfig.java:635)
        at org.apache.catalina.startup.ContextConfig.lifecycleEvent(ContextConfig.java:216)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4290)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:823)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:807)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:595)
        at org.apache.catalina.core.StandardHostDeployer.install(StandardHostDeployer.java:277)
        at org.apache.catalina.core.StandardHost.install(StandardHost.java:832)
        at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:701)
        at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:432)
        at org.apache.catalina.startup.HostConfig.start(HostConfig.java:983)
        at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:349)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1091)
        at org.apache.catalina.core.StandardHost.start(StandardHost.java:789)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1083)
        at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:478)
        at org.apache.catalina.core.StandardService.start(StandardService.java:480)
        at org.apache.catalina.core.StandardServer.start(StandardServer.java:2365)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:556)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:324)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:287)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:425)
Mar 30, 2005 3:47:20 PM org.apache.catalina.cluster.session.DeltaManager start
WARNING: Manager[/clusterapp], requesting session state from
org.apache.catalina.cluster.mcast.McastMember[tcp://127.0.0.1:4002,127.0.0.1,4002,
alive=92933570]. This operation will timeout if no session state has been
received within 60 seconds
Mar 30, 2005 3:48:21 PM org.apache.catalina.cluster.session.DeltaManager start
SEVERE: Manager[/clusterapp], No session state received, timing out.
ClusterApp context is created.
Mar 30, 2005 3:48:21 PM org.apache.catalina.core.StandardHostDeployer install
INFO: Installing web application at context path /servlets-examples from URL
file:/opt/dev/share/jakarta/tomcat/base/webapps/servlets-examples
Mar 30, 2005 3:48:21 PM org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-8080
Mar 30, 2005 3:48:21 PM org.apache.jk.server.JkMain start
INFO: APR not loaded, disabling jni components: java.io.IOException:
java.lang.UnsatisfiedLinkError: no jkjni in java.library.path
Mar 30, 2005 3:48:21 PM org.apache.jk.common.ChannelSocket init
INFO: JK2: ajp13 listening on /0.0.0.0:8009
Mar 30, 2005 3:48:21 PM org.apache.jk.server.JkMain start
INFO: Jk running ID=0 time=3/94 
config=/opt/dev/share/jakarta/tomcat/base/conf/jk2.properties
Mar 30, 2005 3:48:21 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 68088 ms
--------------------------------------------------------------------------------

The state request times out, a new session context is created for my web
application, and when I switch to this server my old session data is lost.

I modified DeltaManager.java a bit, and that seems to have solved the problem.
--------------------------------------------------------------------------

[hzhao@donau session]$ diff DeltaManager.java
/opt/jakarta-tomcat-5.0.28-src/jakarta-tomcat-catalina/modules/cluster/src/share/org/apache/catalina/cluster/session/DeltaManager.java
632d631
<           Member mbr=null;
634,642c633
<               for(int index=0; index<cluster.getMembers().length; index++) {
<                       mbr = cluster.getMembers()[index];
<                       if (mbr.getHost().equals("127.0.0.1"))
<                               mbr = null;
<                       else
<                               break;
<               }
<           }
<           if (mbr != null) {
---
>                 Member mbr = cluster.getMembers()[0];
[hzhao@donau session]$
Comment 1 Hang Zhao 2005-04-01 04:52:46 UTC
I reviewed the log again. Obviously I was wrong, and my code above is wrong too.
The node xxx.xx.20.218:4002 is a normal node and should be added to the active
members of the cluster, but somehow 127.0.0.1:4002 also appeared and was added
to the active members, which causes the timeout. (The node on localhost uses
port 4001, so this is not the local node.) I wonder why this happens.

Sorry if this caused any confusion.
Comment 2 Hang Zhao 2005-04-01 20:37:31 UTC
I checked thoroughly today and found the cause: one of the computers on our
intranet runs a Tomcat instance that was sending out the wrong information,
"tcp listen address 127.0.0.1:4002".
I changed the multicast port, and now it works fine.
This is not a bug. Sorry for the confusion.
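
For anyone who hits the same symptom, here is a minimal sketch of how the
offending member could be spotted from the addresses in the log. It is not
Tomcat code: AdvertisedMember and findLoopbackMembers are hypothetical names,
and 192.0.2.10 is a placeholder for the masked address of the healthy node. The
only real API used is java.net.InetAddress. The idea is that a remote member
advertising a loopback listen address can never be reached from another host,
so a session state request sent to it is refused or times out.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class LoopbackMemberCheck {

    // Hypothetical stand-in for the host/port a cluster member advertises.
    static final class AdvertisedMember {
        final String host;
        final int port;

        AdvertisedMember(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return "tcp://" + host + ":" + port;
        }
    }

    // True when the advertised host is a loopback address such as 127.0.0.1.
    // Hosts that cannot be resolved are treated as "not loopback" here.
    static boolean advertisesLoopback(AdvertisedMember member) {
        try {
            return InetAddress.getByName(member.host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // Collects the members that other hosts cannot possibly connect to.
    static List<AdvertisedMember> findLoopbackMembers(List<AdvertisedMember> members) {
        List<AdvertisedMember> result = new ArrayList<>();
        for (AdvertisedMember member : members) {
            if (advertisesLoopback(member)) {
                result.add(member);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<AdvertisedMember> members = new ArrayList<>();
        members.add(new AdvertisedMember("192.0.2.10", 4002)); // placeholder for the healthy node
        members.add(new AdvertisedMember("127.0.0.1", 4002));  // the misconfigured instance
        for (AdvertisedMember member : findLoopbackMembers(members)) {
            System.out.println("Advertises a loopback address: " + member);
        }
    }
}
```

If a member like this shows up, the choices are to fix the listen address on
that instance or, as I did, to move this cluster to a different multicast port
so the stray instance is no longer seen.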