Details
- Type: Bug
- Status: Closed
- Priority: Minor
- Resolution: Fixed
- 0.9.3
- None
This patch changes the default naming convention for the files the collector writes out. Output file names now have the following format: <prefix>yyyyMMdd-HHmmssSSSSz.<12digitNanos>.<8charTid>
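To illustrate why a name of this shape sorts in write order, here is a minimal Python sketch, not the collector's actual code. It assumes the date part means a fixed-width timestamp with milliseconds followed by a zone offset, that the nanosecond field is zero-padded to 12 decimal digits, and that the thread id is zero-padded to 8 decimal digits. The function name `collector_file_name` and the prefix `log-` are hypothetical.

```python
from datetime import datetime, timezone


def collector_file_name(prefix, when, nanos, tid):
    """Build a name shaped like <prefix><timestamp><zone>.<12 digits>.<8 digits>.

    Assumption: every field is fixed width, so plain string comparison
    of two names agrees with the order of their timestamps.
    """
    millis = when.microsecond // 1000
    stamp = when.strftime("%Y%m%d-%H%M%S") + f"{millis:03d}" + when.strftime("%z")
    # Truncate and zero-pad the counters so the field widths never change.
    return f"{prefix}{stamp}.{nanos % 10**12:012d}.{tid % 10**8:08d}"


earlier = collector_file_name(
    "log-", datetime(2011, 3, 17, 4, 0, 0, 5000, tzinfo=timezone.utc), 42, 17
)
later = collector_file_name(
    "log-", datetime(2011, 3, 17, 4, 0, 1, 0, tzinfo=timezone.utc), 7, 17
)
print(earlier)  # log-20110317-040000005+0000.000000000042.00000017
print(later)    # log-20110317-040001000+0000.000000000007.00000017
```

Because the timestamp comes first and has a fixed width, the earlier file compares lower than the later one even though its nanosecond counter is larger.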
Description
We're transitioning to Hadoop. Until then, we're parsing the files that Flume drops on S3.
S3's API returns keys in lexicographic order, so it's easy to ask S3:
"Given I am on 2011-03-17/0400/flume-1.seq, give me one file."
Assuming the next lexicographically ordered file is 2011-03-17/0400/flume-2.seq, then you don't have to do any cumbersome faux-directory sweeping (since S3 doesn't know about directories per se). You can let Amazon do that work for you.
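The "give me one file after this key" request can be sketched as follows. This is a hedged illustration rather than the reporter's actual code: `next_key` is a hypothetical helper, the bucket name is made up, and `FakeS3` is an in-memory stand-in so the example runs without network access. With boto3, a client from `boto3.client("s3")` could be passed in instead, since its `list_objects_v2` call accepts the same `Bucket`, `StartAfter`, and `MaxKeys` parameters.

```python
import bisect


def next_key(client, bucket, current_key):
    """Return the first key that sorts after current_key, or None."""
    response = client.list_objects_v2(
        Bucket=bucket, StartAfter=current_key, MaxKeys=1
    )
    contents = response.get("Contents", [])
    return contents[0]["Key"] if contents else None


class FakeS3:
    """In-memory stand-in that mimics S3's sorted key listing (assumption)."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def list_objects_v2(self, Bucket, StartAfter="", MaxKeys=1000):
        # Keys strictly greater than StartAfter, in lexicographic order.
        start = bisect.bisect_right(self._keys, StartAfter)
        page = self._keys[start:start + MaxKeys]
        return {"Contents": [{"Key": k} for k in page]} if page else {}


fake = FakeS3([
    "2011-03-17/0400/flume-1.seq",
    "2011-03-17/0400/flume-2.seq",
    "2011-03-17/0500/flume-1.seq",
])
print(next_key(fake, "example-bucket", "2011-03-17/0400/flume-1.seq"))
```

The same call also crosses the faux-directory boundary: asking for the key after `2011-03-17/0400/flume-2.seq` yields `2011-03-17/0500/flume-1.seq` with no directory sweep.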
We don't have any requirements about sprintf-style formatting of the filenames, only that the names sort in the order the files were written.
Rob
Attachments
Issue Links
- is related to FLUME-48: collectorSink writes files with .seq suffix even though text files are written out. (Closed)