Key: JS1-526
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Blocker
Assignee: Mark Orciuch
Reporter: Bjorn Vidar Remme
Votes: 0
Watchers: 1
Jetspeed

[PATCH] Hyperthreading causes registry/PSML loading errors

Created: 24/Nov/04 10:50 AM   Updated: 26/May/05 12:28 PM
Component/s: PSML, Registry
Affects Version/s: 1.6-dev
Fix Version/s: 1.6

Time Tracking:
Not Specified

File Attachments:
JS1-526-patch.txt (text file, 18 kB, licensed for inclusion in ASF works), 2004-11-30 01:48 PM, Bjorn Vidar Remme
Environment: Windows servers (2000/2003) with hyperthreading CPUs (or multiple CPU)

Resolution Date: 18/Jan/05 08:36 AM


Description
Running Jetspeed 1.6-dev (HEAD -> 2004-11-23) throws exceptions when reading xreg and psml files if the system has more than one CPU (or hyperthreading). This affects 1.6-dev only; 1.5 works perfectly.

After disabling hyperthreading Jetspeed-1.6 works perfectly (confirmed on three different servers).

Where the problem occurs is somewhat random and depends on the server load.


The problem seems to be associated with the classes 'LateCastorRegistryService' and 'CastorPsmlManagerService'.

Please contact me for more information if needed.


Here is an example exception from the jetspeedservices log (not very informative):

2004-11-24 11:27:34,656 [http-8080-Processor25] ERROR CastorPsmlManagerService - PSMLManager: Could not unmarshal the file D:\apache\Tomcat_5.0\webapps\jetspeed-1.6-dev-20041123\WEB-INF\psml\user\anon\html\default.psml
org.xml.sax.SAXException: The class for the root element 'portlets' could not be found.
at org.exolab.castor.xml.UnmarshalHandler.startElement(UnmarshalHandler.java:595)
at org.exolab.castor.xml.util.DOMEventProducer.process(DOMEventProducer.java:245)
at org.exolab.castor.xml.util.DOMEventProducer.process(DOMEventProducer.java:182)
at org.exolab.castor.xml.util.DOMEventProducer.processChildren(DOMEventProducer.java:333)
at org.exolab.castor.xml.util.DOMEventProducer.process(DOMEventProducer.java:134)
at org.exolab.castor.xml.util.DOMEventProducer.process(DOMEventProducer.java:170)
at org.exolab.castor.xml.util.DOMEventProducer.start(DOMEventProducer.java:110)
at org.exolab.castor.xml.Unmarshaller.unmarshal(Unmarshaller.java:290)
at org.exolab.castor.xml.Unmarshaller.unmarshal(Unmarshaller.java:374)
at org.apache.jetspeed.services.psmlmanager.CastorPsmlManagerService.loadDocument(CastorPsmlManagerService.java:472)
at org.apache.jetspeed.services.psmlmanager.CastorPsmlManagerService.getDocument(CastorPsmlManagerService.java:387)
at org.apache.jetspeed.services.psmlmanager.CastorPsmlManagerService.getDocument(CastorPsmlManagerService.java:340)
at org.apache.jetspeed.services.PsmlManager.getDocument(PsmlManager.java:72)
at org.apache.jetspeed.services.profiler.JetspeedProfilerService.fallback(JetspeedProfilerService.java:734)
at org.apache.jetspeed.services.profiler.JetspeedProfilerService.fallbackProfile(JetspeedProfilerService.java:509)
at org.apache.jetspeed.services.profiler.JetspeedProfilerService.getProfile(JetspeedProfilerService.java:262)
at org.apache.jetspeed.services.profiler.JetspeedProfilerService.getProfile(JetspeedProfilerService.java:545)
at org.apache.jetspeed.services.Profiler.getProfile(Profiler.java:87)
at org.apache.jetspeed.modules.actions.JetspeedAccessController.doPerform(JetspeedAccessController.java:74)
at org.apache.turbine.modules.Action.perform(Action.java:87)
at org.apache.turbine.modules.ActionLoader.exec(ActionLoader.java:122)
at org.apache.turbine.Turbine.doGet(Turbine.java:529)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:704)
at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:474)
at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:409)
at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:312)
at org.apache.jasper.runtime.PageContextImpl.doForward(PageContextImpl.java:670)
at org.apache.jasper.runtime.PageContextImpl.forward(PageContextImpl.java:637)
at org.apache.jsp.index_jsp._jspService(index_jsp.java:45)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:324)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:214)
at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:198)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:152)
at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:118)
at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:102)
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929)
at org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:160)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:799)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:705)
at org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:577)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
at java.lang.Thread.run(Thread.java:534)


Comments
Jaq Marit added a comment - 25/Nov/04 07:05 AM
I've encountered the same messages in my logs, specifically on Windows. I haven't noticed this on my Linux box. Could this be a VM issue?

Bjorn Vidar Remme added a comment - 25/Nov/04 08:19 AM
Interesting observation. Unfortunately it is not possible for me to test this, since all my Linux boxes have single CPUs without hyperthreading. I might be able to perform tests on a Linux/hyper-threaded box a few days from now.

I might add that we are using JDK 1.4.2_06 at the moment.

Note that if you put some load on the server, then Jetspeed 1.6-dev will start and run. It is therefore vital that the server has no load whatsoever when trying to reproduce this error.

And I must note that Jetspeed 1.5 is very stable on the same system. Not a single glitch!


Jaq Marit added a comment - 26/Nov/04 02:58 AM
JS1.6-dev from cvs is running perfectly on our 2 Xeon HT processors with JDK1.4.2_06. However, I can't guarantee that the server is lightly loaded, as it is a production server. I'll try my own tests in the next few days.

Bjorn Vidar Remme added a comment - 30/Nov/04 01:48 PM
Patch that uses a new SynchronizedMapping object to serialize all access to the Castor Mapping.loadMapping() methods.

Bjorn Vidar Remme added a comment - 30/Nov/04 02:04 PM
I finally had time to investigate this problem in detail and I have created a patch that solved this issue on all our servers (attachment JS1-526-patch.txt).

To track this down, I added more debug code to the CastorPsmlManagerService and the LateInitCastorRegistryService. Then I compared the logs from runs where Jetspeed 1.6 worked with the logs from runs where it crashed (as I have mentioned before, it works if I generate some load on the server).
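
As a rough illustration of that kind of debug code (not the actual statements added to Jetspeed), the sketch below tags each message with the current thread name and a start/done marker, so overlapping steps from different threads show up when the logs are compared. The class name ThreadTaggedDebug and the helper tag() are hypothetical.

```java
public class ThreadTaggedDebug {

    /**
     * Hypothetical helper: builds a debug line that records which thread
     * is performing the given step of the given component.
     */
    static String tag(String component, String step) {
        return "[" + Thread.currentThread().getName() + "] " + component + " - " + step;
    }

    public static void main(String[] args) {
        // Bracketing a suspicious call with start/done lines makes it visible
        // in the log when another thread enters the same step in between.
        System.out.println(tag("CastorPsmlManagerService", "loading mapping: start"));
        System.out.println(tag("CastorPsmlManagerService", "loading mapping: done"));
    }
}
```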

Comparing the logs, I found that the services run in two different threads (psml in main and lateInit in Deamonthread:feeddeamon). I then discovered that the crashes occur when both threads load a Castor mapping at the same time. The Mapping.loadMapping() method appears not to be thread safe. Synchronizing all access to the Mapping.loadMapping() methods seems to be the solution (see the new SynchronizedMapping class).

The Unmarshaller.unmarshal() methods appear to be thread safe; can anyone comment on this, please?
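
The idea of serializing the loads can be sketched roughly as follows. This is not the content of JS1-526-patch.txt: FragileLoader is a hypothetical stand-in for a loader that is not thread safe (the role Castor's Mapping plays in this issue), and SynchronizedLoader is an assumed wrapper that funnels every load through one lock. The lock is static on the assumption that the two services hold separate loader objects, so a per-instance lock would not keep them apart.

```java
import java.util.ArrayList;
import java.util.List;

public class SynchronizedLoadingSketch {

    /** Hypothetical stand-in for a loader that is NOT thread safe. */
    static class FragileLoader {
        private final List<String> loaded = new ArrayList<String>();

        void load(String resource) {
            // Unsynchronized mutation of shared state: unsafe under concurrent calls.
            loaded.add(resource);
        }

        int count() {
            return loaded.size();
        }
    }

    /** Assumed wrapper: every call goes through one class-wide lock. */
    static class SynchronizedLoader {
        private static final Object LOCK = new Object();
        private final FragileLoader delegate;

        SynchronizedLoader(FragileLoader delegate) {
            this.delegate = delegate;
        }

        void load(String resource) {
            synchronized (LOCK) {
                delegate.load(resource);
            }
        }

        int count() {
            synchronized (LOCK) {
                return delegate.count();
            }
        }
    }

    /** Two threads each call load() loadsPerThread times; returns the total recorded. */
    static int runTwoThreads(final int loadsPerThread) {
        final SynchronizedLoader loader = new SynchronizedLoader(new FragileLoader());
        Runnable task = new Runnable() {
            public void run() {
                for (int i = 0; i < loadsPerThread; i++) {
                    loader.load("mapping-" + i);
                }
            }
        };
        // The thread names only mirror the two roles described in the comment above.
        Thread psml = new Thread(task, "main-like");
        Thread registry = new Thread(task, "daemon-like");
        psml.start();
        registry.start();
        try {
            psml.join();
            registry.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for loaders", e);
        }
        return loader.count();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(1000)); // prints 2000
    }
}
```

Because both threads take the same lock, no load is lost and the count is always the sum of both threads' calls. Calling FragileLoader.load() directly from two threads gives no such guarantee.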

Mark Orciuch added a comment - 18/Jan/05 08:32 AM
I encountered the same problems on a dual-processor box with hyperthreading (Win2K). After applying the patch, things look much better.

Mark Orciuch added a comment - 18/Jan/05 08:36 AM
Patch applied and tested