  1. Flume
  2. FLUME-600

Have collector source create names that are both lexicographically and chronologically ordered

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.9.3
    • Fix Version/s: 0.9.4
    • Component/s: Sinks+Sources
    • Labels: None
    • Release Note: This patch changes the default naming convention for the files the collector writes. Output file names now have the following format: <prefix>yyyyMMdd-HHmmssSSSSz.<12digitNanos>.<8charTid>
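
To illustrate why a fixed-width, timestamp-first name sorts chronologically, here is a minimal Python sketch. It only approximates the pattern in the release note: the helper `collector_file_name`, the three-digit milliseconds, the numeric UTC offset, and the zero-padded decimal thread id are assumptions made for illustration, not Flume's actual implementation.

```python
from datetime import datetime, timedelta, timezone


def collector_file_name(prefix, when, nanos, thread_id):
    """Hypothetical helper: build a name shaped like
    <prefix>yyyyMMdd-HHmmssSSSz.<12digitNanos>.<8charTid>."""
    stamp = when.strftime("%Y%m%d-%H%M%S")       # yyyyMMdd-HHmmss
    millis = f"{when.microsecond // 1000:03d}"   # assumed 3-digit milliseconds
    zone = when.strftime("%z")                   # assumed numeric offset, e.g. +0000
    # Zero-padding keeps every field fixed-width, so string order follows numeric order.
    return f"{prefix}{stamp}{millis}{zone}.{nanos % 10**12:012d}.{thread_id % 10**8:08d}"


start = datetime(2011, 3, 17, 4, 0, 0, tzinfo=timezone.utc)
names = [
    collector_file_name("log.", start + timedelta(seconds=90 * i), nanos=1000 * i, thread_id=7)
    for i in range(12)
]

# Names generated in time order are already in lexicographic order.
in_order = names == sorted(names)

# An unpadded counter lacks this property: "flume-10.seq" sorts before "flume-2.seq".
unpadded = [f"flume-{i}.seq" for i in (1, 2, 10)]
unpadded_in_order = unpadded == sorted(unpadded)
```

The property holds in this sketch only while every name shares the same prefix and time zone offset; mixing offsets would break the match between string order and time order.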

    Description

      We're transitioning to Hadoop. Until then, we're parsing the files that Flume drops on S3.

      S3's API says that keys are returned in lexicographic order, so it's easy to ask S3:

      "Given I am on 2011-03-17/0400/flume-1.seq, give me one file."

      Assuming the next lexicographically ordered file is 2011-03-17/0400/flume-2.seq, then you don't have to do any cumbersome faux-directory sweeping (since S3 doesn't know about directories per se). You can let Amazon do that work for you.
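
The "give me one file after this key" request can be sketched roughly as follows. This assumes a boto3-style client exposing `list_objects_v2` with `StartAfter` and `MaxKeys`; the names `next_key`, `InMemoryListing`, and `example-bucket` are hypothetical, and the in-memory class is only a stand-in so the sketch runs without S3 credentials.

```python
from bisect import bisect_right


class InMemoryListing:
    """Stand-in for an S3 client so the sketch runs without credentials.
    It mimics only the small part of list_objects_v2 used below."""

    def __init__(self, keys):
        self._keys = sorted(keys)  # S3 lists keys in lexicographic (UTF-8 byte) order

    def list_objects_v2(self, Bucket, StartAfter="", MaxKeys=1000):
        start = bisect_right(self._keys, StartAfter)
        page = self._keys[start:start + MaxKeys]
        # Like S3, omit "Contents" when nothing matches.
        return {"Contents": [{"Key": k} for k in page]} if page else {}


def next_key(client, bucket, current_key):
    """Return the first key listed after current_key, or None if there is none."""
    response = client.list_objects_v2(Bucket=bucket, StartAfter=current_key, MaxKeys=1)
    contents = response.get("Contents", [])
    return contents[0]["Key"] if contents else None


listing = InMemoryListing([
    "2011-03-17/0400/flume-1.seq",
    "2011-03-17/0400/flume-2.seq",
    "2011-03-17/0500/flume-1.seq",
])
following = next_key(listing, "example-bucket", "2011-03-17/0400/flume-1.seq")
```

Because the listing is ordered by key, a consumer only has to remember the last key it processed; no per-directory sweep is needed.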

      We have no requirements about sprintf-style formatting of the filenames, only that the names sort in the order the files were written.

      Rob

            People

              Assignee: jmhsieh Jonathan Hsieh
              Reporter: flume_robert.slifka@gmail.com Disabled imported user
              Votes: 0
              Watchers: 0
