Details
-
Question
-
Status: Open
-
Major
-
Resolution: Unresolved
-
1.9.0
-
None
-
None
-
flume-env.sh
export JAVA_OPTS="-Xms100m -Xmx1000m -Dcom.sun.management.jmxremote -Dflume.root.logger=INFO,console -javaagent:/opt/flume/flume/jmx_prometheus_javaagent-0.11.0.jar=5000:/opt/flume/flume/jmx_exporter.yml"
java -version
java version "1.8.0_131" Java(TM) SE Runtime Environment (build 1.8.0_131-b11) Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
OS:
Linux 4.9.127-32.el7.x86_64 #1 SMP Mon Sep 17 13:40:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description
We are using a TAILDIR source with a Kafka sink and the following configuration:
tuzla2kafka.sources = tuzla
tuzla2kafka.channels = c1
tuzla2kafka.sinks = kafka

tuzla2kafka.sources.tuzla.type = TAILDIR
tuzla2kafka.sources.tuzla.channels = c1
tuzla2kafka.sources.tuzla.positionFile = /data/flume/positions/tuzla2kafka-taildir_position.json
tuzla2kafka.sources.tuzla.filegroups = tuzla_fluentd
tuzla2kafka.sources.tuzla.filegroups.tuzla_fluentd = /data/tuzla/fluentd/event_log_production.*.log

tuzla2kafka.channels.c1.type = file
tuzla2kafka.channels.c1.checkpointDir = /data/flume/file_channels/c1/checkpoint
tuzla2kafka.channels.c1.dataDirs = /data/flume/file_channels/c1/data
tuzla2kafka.channels.c1.capacity = 1000000

tuzla2kafka.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
tuzla2kafka.sinks.kafka.channel = c1
tuzla2kafka.sinks.kafka.kafka.topic = mini-pipeline
tuzla2kafka.sinks.kafka.kafka.bootstrap.servers = kafka1:9092,kafka2:9092
tuzla2kafka.sinks.kafka.kafka.batchSize = 10000
tuzla2kafka.sinks.kafka.kafka.allowTopicOverride = false
The log files matched by tuzla2kafka.sources.tuzla.filegroups.tuzla_fluentd are rotated hourly, and each one is about 1.5 GB. We have been testing this configuration for 3 days and noticed that Flume skipped 3 files in that time: there are no 'Opening file' / 'Closed file' entries in the Flume logs for these 3 files. Is this a known bug? We're trying to switch from fluentd to Flume, and this behaviour rules Flume out as an alternative.
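
To help pin down which rotated files were never picked up, here is a minimal Python sketch that compares the TAILDIR position file against the files currently on disk. It assumes the position file is a JSON array of objects with "inode", "pos" and "file" keys, and that the file-name part of the filegroup is treated as a regular expression. The helper names load_positions and audit_filegroup are hypothetical and are not part of Flume. It is a diagnostic aid only and does not explain why a file was skipped.

```python
import json
import re
from pathlib import Path


def load_positions(position_file):
    """Return {inode: pos} from a TAILDIR position file.

    Assumption: the file is a JSON array of objects that carry
    "inode", "pos" and "file" keys.
    """
    with open(position_file) as fh:
        return {int(entry["inode"]): int(entry["pos"]) for entry in json.load(fh)}


def audit_filegroup(directory, filename_regex, positions):
    """Classify every file in `directory` whose name fully matches
    `filename_regex` as 'missing', 'partial' or 'complete'."""
    pattern = re.compile(filename_regex)
    report = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or not pattern.fullmatch(path.name):
            continue
        stat = path.stat()
        pos = positions.get(stat.st_ino)
        if pos is None:
            status = "missing"    # inode never recorded in the position file
        elif pos < stat.st_size:
            status = "partial"    # recorded, but not yet read to the end
        else:
            status = "complete"   # recorded offset has reached the file size
        report[str(path)] = status
    return report
```

With the paths from the configuration above, this could be called as audit_filegroup("/data/tuzla/fluentd", r"event_log_production.*.log", load_positions("/data/flume/positions/tuzla2kafka-taildir_position.json")). Files reported as 'missing' are candidates for the skipped ones. Run it while the rotated files still exist on disk, because the comparison relies on their inodes.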