Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Abandoned
-
5.8.0
-
None
-
None
-
Java 1.6.0 most releases, tested on lots of different hardware, tested on Linux distros only.
Description
Similar to AMQ-4339
If the index is corrupted, ActiveMQ may receive an EOFException when the first message is published. The easiest way to replicate this for us has been to purposely put junk into a journal as so:
truncate -s 32700 db.data
The exception in this case has been (this error is specific to AMQ 5.6.0, so lines may mismatch on 5.8.0, but similar problem was confirmed on 5.8.0):
2013-03-01 16:05:06,451 ERROR [kahadb.MessageDatabase] KahaDB failed to store to Journal java.io.EOFException at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399) at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377) at org.apache.kahadb.page.PageFile.readPage(PageFile.java:867) at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446) at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:437) at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434) at org.apache.kahadb.page.Transaction.load(Transaction.java:410) at org.apache.kahadb.page.Transaction.load(Transaction.java:367) at org.apache.kahadb.index.BTreeIndex.loadNode(BTreeIndex.java:262) at org.apache.kahadb.index.BTreeIndex.getRoot(BTreeIndex.java:174) at org.apache.kahadb.index.BTreeIndex.put(BTreeIndex.java:189) at org.apache.activemq.store.kahadb.MessageDatabase.upadateIndex(MessageDatabase.java:1197) at org.apache.activemq.store.kahadb.MessageDatabase$14.execute(MessageDatabase.java:1027) at org.apache.kahadb.page.Transaction.execute(Transaction.java:769) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:1025) at org.apache.activemq.store.kahadb.MessageDatabase$13.visit(MessageDatabase.java:972) at org.apache.activemq.store.kahadb.data.KahaAddMessageCommand.visit(KahaAddMessageCommand.java:241) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:969) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:881) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:863) at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.addMessage(KahaDBStore.java:431) at org.apache.activemq.store.kahadb.KahaDBStore$StoreQueueTask.run(KahaDBStore.java:1177) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:680) 2013-03-01 16:05:06,743 WARN [server.AbstractHttpConnection] /v2/commands/ org.springframework.jms.UncategorizedJmsException: Uncategorized exception occured during JMS processing; nested exception is javax.jms.JMSException: java.io.EOFException at org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:316) at org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:168) at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:469) at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:543) at org.springframework.jms.core.JmsTemplate.convertAndSend(JmsTemplate.java:653) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) at clamq.jms$jms_producer$reify__3138.publish(jms.clj:29) at clamq.jms$jms_producer$reify__3138.publish(jms.clj:30) at clamq.protocol.producer$eval3094$fn__3095$G__3086__3103.invoke(producer.clj:3) at clamq.protocol.producer$eval3094$fn__3095$G__3085__3112.invoke(producer.clj:3) at clojure.lang.AFn.applyToHelper(AFn.java:167) at clojure.lang.AFn.applyTo(AFn.java:151) at clojure.core$apply.invoke(core.clj:603) at com.puppetlabs.mq$connect_and_publish_BANG_.doInvoke(mq.clj:136) at clojure.lang.RestFn.invoke(RestFn.java:439) at com.puppetlabs.puppetdb.command$enqueue_raw_command_BANG_$fn__3754.invoke(command.clj:254) at com.puppetlabs.puppetdb.command$enqueue_raw_command_BANG_.invoke(command.clj:253) at com.puppetlabs.puppetdb.http.v1.command$enqueue_command.invoke(command.clj:22) at com.puppetlabs.middleware$verify_accepts_content_type$fn__4240.invoke(middleware.clj:67) at com.puppetlabs.middleware$verify_checksum$fn__4249.invoke(middleware.clj:102) at com.puppetlabs.middleware$verify_param_exists$fn__4244.invoke(middleware.clj:79) at com.puppetlabs.puppetdb.http.v1.command$command_app.invoke(command.clj:27) at com.puppetlabs.puppetdb.http.v2$v2_app$fn__6640.invoke(v2.clj:12) at net.cgrand.moustache$alter_request$fn__4450.invoke(moustache.clj:54) at com.puppetlabs.puppetdb.http.v2$v2_app.invoke(v2.clj:12) at com.puppetlabs.puppetdb.http.server$routes$fn__7042.invoke(server.clj:27) at net.cgrand.moustache$alter_request$fn__4450.invoke(moustache.clj:54) at com.puppetlabs.puppetdb.http.server$routes.invoke(server.clj:27) at ring.middleware.resource$wrap_resource$fn__7022.invoke(resource.clj:14) at ring.middleware.params$wrap_params$fn__4400.invoke(params.clj:55) at com.puppetlabs.middleware$wrap_with_authorization$fn__4226.invoke(middleware.clj:21) at com.puppetlabs.middleware$wrap_with_certificate_cn$fn__4230.invoke(middleware.clj:36) at com.puppetlabs.middleware$wrap_with_default_body$fn__4233.invoke(middleware.clj:43) at com.puppetlabs.middleware$wrap_with_metrics_STAR_$fn__4253$fn__4254.invoke(middleware.clj:119) at com.puppetlabs.middleware.proxy$java.lang.Object$Callable$f8c5758f.call(Unknown Source) at com.yammer.metrics.core.Timer.time(Timer.java:91) at com.puppetlabs.middleware$wrap_with_metrics_STAR_$fn__4253.invoke(middleware.clj:117) at com.puppetlabs.middleware$wrap_with_globals$fn__4236.invoke(middleware.clj:54) at ring.adapter.jetty$proxy_handler$fn__4071.invoke(jetty.clj:18) at ring.adapter.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$0.handle(Unknown Source) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111) at org.eclipse.jetty.server.Server.handle(Server.java:349) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:452) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:894) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:948) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534) at java.lang.Thread.run(Thread.java:680) Caused by: javax.jms.JMSException: java.io.EOFException at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:49) at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1362) at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1290) at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1785) at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:277) at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:212) at org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:269) at org.springframework.jms.connection.CachedMessageProducer.send(CachedMessageProducer.java:117) at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:592) at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:569) at org.springframework.jms.core.JmsTemplate$4.doInJms(JmsTemplate.java:546) at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:466) ... 55 more Caused by: java.util.concurrent.ExecutionException: java.io.EOFException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:785) at org.apache.activemq.broker.region.Queue.send(Queue.java:707) at org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:407) at org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:503) at org.apache.activemq.broker.jmx.ManagedRegionBroker.send(ManagedRegionBroker.java:305) at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) at org.apache.activemq.broker.scheduler.SchedulerBroker.send(SchedulerBroker.java:189) at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) at org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96) at org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:306) at org.apache.activemq.broker.MutableBrokerFilter.send(MutableBrokerFilter.java:135) at org.apache.activemq.broker.TransportConnection.processMessage(TransportConnection.java:453) at org.apache.activemq.command.ActiveMQMessage.visit(ActiveMQMessage.java:681) at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292) at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:150) at org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116) at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50) at org.apache.activemq.transport.vm.VMTransport.iterate(VMTransport.java:231) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) ... 1 more Caused by: java.io.EOFException at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399) at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377) at org.apache.kahadb.page.PageFile.readPage(PageFile.java:867) at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446) at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:437) at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434) at org.apache.kahadb.page.Transaction.load(Transaction.java:410) at org.apache.kahadb.page.Transaction.load(Transaction.java:367) at org.apache.kahadb.index.BTreeIndex.loadNode(BTreeIndex.java:262) at org.apache.kahadb.index.BTreeIndex.getRoot(BTreeIndex.java:174) at org.apache.kahadb.index.BTreeIndex.put(BTreeIndex.java:189) at org.apache.activemq.store.kahadb.MessageDatabase.upadateIndex(MessageDatabase.java:1197) at org.apache.activemq.store.kahadb.MessageDatabase$14.execute(MessageDatabase.java:1027) at org.apache.kahadb.page.Transaction.execute(Transaction.java:769) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:1025) at org.apache.activemq.store.kahadb.MessageDatabase$13.visit(MessageDatabase.java:972) at org.apache.activemq.store.kahadb.data.KahaAddMessageCommand.visit(KahaAddMessageCommand.java:241) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:969) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:881) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:863) at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.addMessage(KahaDBStore.java:431) at org.apache.activemq.store.kahadb.KahaDBStore$StoreQueueTask.run(KahaDBStore.java:1177) ... 3 more
At the moment the only work-around I have is to blank out the db.data and let it get recreated from the journal.
I just want to stress that the corruption wasn't caused by KahaDB, at least we haven't seen any cases. The corruptions were either caused by: disk filling up, or bad copies when migrating directories etc.