Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: 1.8.0
None
None
Environment: Ubuntu (AWS EC2)
Description
I found some .tmp files in my S3 bucket.
Sometimes I find two files with the same content, and I have to remove the .tmp file.
For example:
flume.1512259200732.txt.gz.tmp and flume.1512259200732.txt.gz
Sometimes I find only a .tmp file, and I have to rename it manually.
This is the log:
03 Dec 2017 05:28:57,641 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:251) - Creating s3a://mylogs/2017-12-03/05/flume.1512277200060.txt.gz.tmp
03 Dec 2017 05:29:28,119 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.close:393) - Closing s3a://mylogs/2017-12-03/05/flume.1512277200060.txt.gz.tmp
03 Dec 2017 05:29:38,120 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.close:400) - failed to close() HDFSWriter for file (s3a://mylogs/2017-12-03/05/flume.1512277200060.txt.gz.tmp). Exception follows.
java.io.IOException: Callable timed out after 10000 ms on file: s3a://mylogs/2017-12-03/05/flume.1512277200060.txt.gz.tmp
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:715)
    at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:397)
    at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:319)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:566)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:401)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:708)
    ... 7 more
This is my sink config:
agent.sinks.k1.type = hdfs
agent.sinks.k1.channel = c1
agent.sinks.k1.hdfs.path = s3a://mylogs/%Y-%m-%d/%H
agent.sinks.k1.hdfs.fileType = CompressedStream
agent.sinks.k1.hdfs.codeC = gzip
agent.sinks.k1.hdfs.filePrefix = flume
agent.sinks.k1.hdfs.fileSuffix = .txt.gz
agent.sinks.k1.hdfs.rollSize = 67108864
agent.sinks.k1.hdfs.rollInterval = 300
agent.sinks.k1.hdfs.rollCount = 100000
agent.sinks.k1.hdfs.batchSize = 1000
agent.sinks.k1.hdfs.useLocalTimeStamp = true
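The "Callable timed out after 10000 ms" in the log appears to match the default of the HDFS sink's hdfs.callTimeout setting, which is not set in the config above. If the close on s3a is just slow, a longer timeout might let the close and the rename from .tmp finish. A minimal sketch, assuming the same agent and sink names as above; the values are illustrative guesses and have not been tested against this setup:

```properties
# Assumption: allow HDFS operations (open, write, flush, close) up to 60 s
# instead of the 10000 ms seen in the log. 60000 is an illustrative value.
agent.sinks.k1.hdfs.callTimeout = 60000
# Assumption: wait 60 s between consecutive attempts to close a file
# (hdfs.retryInterval is in seconds). Also an illustrative value.
agent.sinks.k1.hdfs.retryInterval = 60
```

These two lines would go alongside the other agent.sinks.k1.hdfs.* settings. Whether they remove the leftover .tmp files depends on how long the upload to S3 actually takes.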