Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.1.0
- Fix Version/s: None
- Component/s: None
Description
Currently HDFSEventSink renames a .tmp file only under the following conditions:
1) An event is written to the file and one of the three roll criteria is hit
2) Stopping the HDFSEventSink closes all writers, which renames all currently open .tmp files
3) When the maximum number of open files is reached, older writers are closed and their .tmp files get renamed
The problem I see is this: if events are routed to a path derived from a timestamp, say by day or hour, you should stop seeing any events written to that path once the period has passed. If the last event arrives at an inopportune time, say 5 minutes after the last roll when you're rolling once an hour, you can be left with an orphaned .tmp file that won't be renamed until (2) or (3) occurs. Unless the maximum number of open files is set low, that can be quite a long time.
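One way to mitigate this, assuming a Flume version whose HDFS sink supports the `hdfs.idleTimeout` property, is to have the sink close (and thus rename) a bucket writer after a period of inactivity rather than waiting for a roll criterion, shutdown, or the open-file cap. A minimal sketch of such a configuration (the agent and sink names `a1`/`k1` and the path are hypothetical):

```properties
# Hypothetical agent a1 with a single HDFS sink k1.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d/%H
# Time-based roll: close and rename the current file every hour.
a1.sinks.k1.hdfs.rollInterval = 3600
# Close a writer after 10 minutes without new events, so a stale
# .tmp file left in a past hour's directory is renamed promptly
# instead of lingering until shutdown or the maxOpenFiles limit.
a1.sinks.k1.hdfs.idleTimeout = 600
```

With `hdfs.idleTimeout` set, the orphan scenario described above (a last event 5 minutes after a roll) closes out within the idle window rather than waiting indefinitely.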
Attachments
Issue Links
- duplicates
  - FLUME-1219 Race conditions in BucketWriter / HDFSEventSink (Resolved)
- is related to
  - FLUME-1163 HDFSEventSink leaves .tmp files in place when Flume is stopped (Closed)