Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
0.9.4
-
None
Description
We are experiencing some problems with Flume that randomly happen on our systems 2 or more times per day. Here is what they look like...
This is the first sensible message in logs regarding this issue:
...
2012-10-23 15:02:27,021 INFO com.cloudera.flume.handlers.debug.StubbornAppendSink: append failed on event 'xservice-backend02 [INFO Tue Oct 23 15:02:12
UTC 2012]
{ AckTag : 20121023-150211682+0000.2380890497695726.00000068 }
{ AckType : msg }
{ ds : 2012-10-23 }
{ tailSrcFile : operation.xservice-backend02.log }
2012-10-23 15:02:12,053 \tRESP\txaxpxo\tf0bdc8efc9df943380359bc4
f5a1058e76\t/0/poster/\tPOST\t\"force_rewrite\": \"1\", \"session_key\": \"f0bdc8efc9df943380359bc4f5a1058e76\", \"filesize\": \"73\", \"name\": \"/Download/
aspect/TRANSITIVE AND ARGUMENT STRUCTURE.pdf\"\t6c86d15a5593c90381ef2d0b33fa7aa9\t0.0876088142395\t3\tpc:0.992:en:lan:Virtual Drive:\tCloud_Syncer\t' w
ith error: Blocked append interrupted by rotation event
After a few seconds exceptions appear:
...
2012-10-23 15:02:31,769 INFO com.cloudera.flume.handlers.debug.InsistentAppendDecorator: Failed due to unexpected runtime exception during append attempt
java.lang.RuntimeException: Blocked append interrupted by rotation event
at com.cloudera.flume.handlers.rolling.RollSink.append(RollSink.java:215)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.core.MaskDecorator.append(MaskDecorator.java:43)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.debug.InsistentOpenDecorator.append(InsistentOpenDecorator.java:169)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.debug.StubbornAppendSink.append(StubbornAppendSink.java:71)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.debug.InsistentAppendDecorator.append(InsistentAppendDecorator.java:110)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.endtoend.AckChecksumChecker.append(AckChecksumChecker.java:172)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.batch.UnbatchingDecorator.append(UnbatchingDecorator.java:62)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.batch.GunzipDecorator.append(GunzipDecorator.java:81)
at com.cloudera.flume.collector.CollectorSink.append(CollectorSink.java:241)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.core.extractors.RegexExtractor.append(RegexExtractor.java:88)
at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:133)
2012-10-23 15:02:31,770 WARN com.cloudera.flume.core.connector.DirectDriver: Exception in sink: RegexExtractor
java.lang.RuntimeException: Blocked append interrupted by rotation event
at com.cloudera.flume.handlers.rolling.RollSink.append(RollSink.java:215)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.core.MaskDecorator.append(MaskDecorator.java:43)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.debug.InsistentOpenDecorator.append(InsistentOpenDecorator.java:169)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.debug.StubbornAppendSink.append(StubbornAppendSink.java:71)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.debug.InsistentAppendDecorator.append(InsistentAppendDecorator.java:110)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.endtoend.AckChecksumChecker.append(AckChecksumChecker.java:172)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.batch.UnbatchingDecorator.append(UnbatchingDecorator.java:62)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.handlers.batch.GunzipDecorator.append(GunzipDecorator.java:81)
at com.cloudera.flume.collector.CollectorSink.append(CollectorSink.java:241)
at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
at com.cloudera.flume.core.extractors.RegexExtractor.append(RegexExtractor.java:88)
at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:133)
2012-10-23 15:02:31,770 WARN com.cloudera.flume.core.connector.DirectDriver: Retrying after Error in source: RegexExtractor
2012-10-23 15:02:31,770 INFO com.cloudera.flume.handlers.rolling.RollSink: closing RollSink 'escapedCustomDfs("hdfs://cdh-master.domain.com:54310/user/flume/x_backends/ds=%
2012-10-23 15:02:31,770 INFO com.cloudera.flume.handlers.rolling.RollSink: double close 'escapedCustomDfs("hdfs://cdh-master.domain.com:54310/user/flume/x_backends/ds=%{ds}
","x_backends.log%
{rolltag}" )'2012-10-23 15:02:31,771 WARN com.cloudera.flume.handlers.endtoend.AckChecksumChecker: partial acks abandoned: {20121023-150211682+0000.2380890497695726.00000068=1350995824815}
2012-10-23 15:02:31,771 INFO com.cloudera.flume.handlers.rolling.RollSink: opening RollSink 'escapedCustomDfs("hdfs://cdh-master.domain.com:54310/user/flume/x_backends/ds=%{ds}","x_backends.log%{rolltag}
" )'
2012-10-23 15:02:31,772 INFO com.cloudera.flume.handlers.debug.InsistentOpenDecorator: Opened MaskDecorator on try 0
2012-10-23 15:02:31,772 INFO com.cloudera.flume.core.connector.DirectDriver: Sink Retry successful
2012-10-23 15:02:31,772 ERROR com.cloudera.flume.core.connector.DirectDriver: Closing down due to exception during append calls
2012-10-23 15:02:31,772 INFO com.cloudera.flume.core.connector.DirectDriver: Connector logicalNode flume_master_x_backends.log-20 exited with error: Event already had an event with attribute ds
java.lang.IllegalArgumentException: Event already had an event with attribute ds
at com.cloudera.flume.core.EventBaseImpl.set(EventBaseImpl.java:62)
at com.cloudera.flume.core.Attributes.setString(Attributes.java:112)
at com.cloudera.flume.core.extractors.RegexExtractor.append(RegexExtractor.java:87)
at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:150)
2012-10-23 15:02:31,772 INFO com.cloudera.flume.collector.CollectorSource: closed
2012-10-23 15:02:32,773 INFO com.cloudera.flume.handlers.thrift.ThriftEventSource: Closed server on port 35854...
2012-10-23 15:02:32,773 INFO com.cloudera.flume.handlers.thrift.ThriftEventSource: Queue still has 182 elements ...
...
And no service listening on port 35854 anymore.
Our Flume configuration:
exec map flume-master.domain.com flume_master_x_backends.log
exec map xservice-backend02 backend02_x_backends.log
exec config backend02_x_backends.log x_backends.log 'tail("/mnt/log/operation.xservice-backend02.log")' 'agentSink("flume-master.domain.com", 35854)'
exec config flume_master_x_backends.log x_backends.log 'collectorSource(35854)' '{ regex("^2[0-9]
-[0-9]
{2}-[0-9]{2}",0,"ds") => collectorSink("hdfs://cdh-master.domain.com:54310/user/flume/x_backends/ds=%
{ds}", "x_backends.log") }'
The strange part is that everything worked fine for months using this particular configuration. Could you please provide some comments as this looks like a bug. We are using Flume 0.9.4-cdh3u5