Description
Hi Colleagues,
We have an interesting bug in camel-quartz2: the fireNow option does not work consistently.
Sometimes it works and sometimes it does not. While debugging I found that the problem occurs when the Scheduler starts and invokes CamelJob. CamelJob starts executing and calls the getProcessors() method of the LoadBalancerSupport class. Sometimes getProcessors() returns an empty list, and as a result the route is not executed.
The StartupListener (QuartzComponent) is notified before the QuartzConsumer starts, so the job cannot get the List<Processor> from the getProcessors() method of the LoadBalancerSupport class.
Analysis of the problem
--------------------------------------------------------------------------------------------------
- The DefaultCamelContext method safelyStartRouteServices() notifies QuartzComponent's onCamelContextStarted() method.
- That step starts the scheduler, which then calls the execute() method of the CamelJob class.
- The execute() method invokes the getProcessors() method of the LoadBalancerSupport class to get the List<Processor>.
- Sometimes getProcessors() returns an empty list.
- The reason is that the List<Processor> in LoadBalancerSupport is populated by QuartzConsumer when the consumer starts, but the consumer is started after DefaultCamelContext has called QuartzComponent's onCamelContextStarted() method.
- When I checked the older version of Camel that we use in our organization, 2.17.0, I saw that the order is different.
- There the Consumer starts first, followed by the classes that implement StartupListener.
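The ordering problem described in the steps above can be illustrated with a small stand-alone sketch. It is not Camel code: ProcessorRegistry, startConsumer and notifyListener are hypothetical stand-ins for LoadBalancerSupport, the QuartzConsumer start and QuartzComponent's onCamelContextStarted(), and the job is assumed to fire immediately once the listener is notified. In the real component the job runs on a scheduler thread, which is why the failure is intermittent rather than deterministic as it is here.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Simplified model of the startup sequence; none of these types are Camel classes.
class StartupOrderSketch {

    // Hypothetical stand-in for the processor list held by LoadBalancerSupport.
    static final class ProcessorRegistry {
        private final List<Runnable> processors = new CopyOnWriteArrayList<>();

        void addProcessor(Runnable processor) {
            processors.add(processor);
        }

        List<Runnable> getProcessors() {
            return processors;
        }
    }

    // Returns how many processors the job saw when it fired.
    static int processorsSeenByJob(boolean listenerBeforeConsumer) {
        ProcessorRegistry registry = new ProcessorRegistry();
        int[] seen = new int[1];

        // Stand-in for the consumer start, which registers the route's processor.
        Runnable startConsumer = () -> registry.addProcessor(() -> { });

        // Stand-in for the listener notification: the scheduler starts and the
        // job (assumed to fire at once) reads the processor list.
        Runnable notifyListener = () -> seen[0] = registry.getProcessors().size();

        if (listenerBeforeConsumer) {
            notifyListener.run();   // order reported in this issue
            startConsumer.run();
        } else {
            startConsumer.run();    // order reported for 2.17.0
            notifyListener.run();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("listener first: " + processorsSeenByJob(true));   // prints 0
        System.out.println("consumer first: " + processorsSeenByJob(false));  // prints 1
    }
}
```

With the listener notified first, the job finds no processors and has nothing to run. With the consumer started first, the processor is already registered when the job fires.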
To Summarize
-------------------------------------------------------------------------------------------------------
- The order in which DefaultCamelContext starts the StartupListener implementations and the Consumer causes a race condition.
- In the older version, Camel 2.17.0, the Consumer starts first, followed by the classes that implement StartupListener.
I have attached a JUnit test that uses quartz2. Since the issue is intermittent, I made the test run 1000 times; it fails randomly, which demonstrates that the problem exists.
I have also attached my proposed solution in DefaultCamelContext.