Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Currently we write in-flight data to a .tmp file. The problem is that if MR jobs are run against the directory we are writing to, a job commonly lists the directory, picks up a .tmp file as input, and then fails at run time because the .tmp file has been renamed in the meantime.
With Java MapReduce you can use a PathFilter to avoid this; however, Pig, Hive, etc. each require a custom solution.
Perhaps we should write to a hidden file instead, so that MR never tries to process data in flight.
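To make the difference between the two approaches concrete, here is a minimal sketch of the name checks involved. It uses only plain Java predicates on file names, so it runs without Hadoop on the classpath. In a real job the first check would sit inside an `org.apache.hadoop.fs.PathFilter` implementation and test `path.getName()`. The second check mirrors the usual Hadoop convention that input names starting with `_` or `.` are treated as hidden. The class name, constants, and file names below are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Flume or Hadoop.
final class InFlightFilterSketch {

    // Workaround available to Java MR jobs: skip names that still carry
    // the in-flight ".tmp" suffix.
    static final Predicate<String> NOT_TMP = name -> !name.endsWith(".tmp");

    // Hidden-file convention: names starting with "_" or "." are skipped.
    static final Predicate<String> NOT_HIDDEN =
            name -> !name.startsWith("_") && !name.startsWith(".");

    // Returns the names from a directory listing that pass the given filter.
    static List<String> visible(List<String> names, Predicate<String> filter) {
        return names.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed listing: one finished file, one plain .tmp file,
        // and one in-flight file written under a hidden name.
        List<String> listing = Arrays.asList(
                "FlumeData.1.avro",
                "FlumeData.2.avro.tmp",
                "_FlumeData.3.avro.tmp");

        // A job must opt in to this filter explicitly.
        System.out.println(visible(listing, NOT_TMP));

        // The hidden-name rule alone still lets the plain .tmp file through,
        // but skips the file written under a "_" prefix.
        System.out.println(visible(listing, NOT_HIDDEN));
    }
}
```

The second listing shows why a hidden in-flight name helps: tools that already skip hidden files never see the file being written, with no per-tool filter needed.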
Attachments
Issue Links
- is duplicated by
-
FLUME-1704 HDFS sink: Add option to prefix .tmp files with some string like "_" or "."
- Resolved
- supersedes
-
FLUME-1486 Ability to configure a staging directory for data
- Resolved
- links to