  1. Flume
  2. FLUME-600

Have collector source create names that are both lexicographically and chronologically ordered


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.9.3
    • Fix Version/s: 0.9.4
    • Component/s: Sinks+Sources
    • Labels: None
    • Release Note: This patch changes the default filename convention that the collector writes out. Output file names will now have the following format: <prefix>yyyyMMdd-HHmmssSSSSz.<12digitNanos>.<8charTid>
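
A minimal sketch of why a name in this shape sorts the same way lexicographically and chronologically: every field is fixed-width and the most significant field comes first. This is not the patch's code. The class and method names are hypothetical, the nanosecond and thread-id fields are assumed to be zero-padded decimals, and the time-zone field is left out because the sketch always renders the timestamp in UTC.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration, not Flume's implementation.
class OrderedNameSketch {
    // Fixed-width timestamp, most significant field first, always in UTC (assumption).
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmssSSS").withZone(ZoneOffset.UTC);

    // Builds <prefix><timestamp>.<12-digit nanos>.<8-digit thread id>.
    // Zero-padding keeps each field the same width, so string order follows numeric order.
    static String fileName(String prefix, Instant when, long nanos, long threadId) {
        return String.format("%s%s.%012d.%08d", prefix, STAMP.format(when), nanos, threadId);
    }

    public static void main(String[] args) {
        String earlier = fileName("log-", Instant.parse("2011-03-17T04:00:00.123Z"), 42L, 7L);
        String later = fileName("log-", Instant.parse("2011-03-17T04:00:01.000Z"), 1L, 7L);
        System.out.println(earlier);
        System.out.println(later);
        // A negative result means the earlier file also sorts first as a string.
        System.out.println(earlier.compareTo(later) < 0);
    }
}
```

The ordering only holds while the prefix and the time zone stay constant across files; mixing zones or prefixes would break the correspondence between string order and time order.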

    Description

      We're transitioning to Hadoop. Until then, we're parsing the files that Flume drops on S3.

      S3's API says that keys will be returned in lexicographic order. It's easy to ask S3:

      "Given I am on 2011-03-17/0400/flume-1.seq, give me one file."

      Assuming the next lexicographically ordered file is 2011-03-17/0400/flume-2.seq, then you don't have to do any cumbersome faux-directory sweeping (since S3 doesn't know about directories per se). You can let Amazon do that work for you.

      We don't have any requirements about sprintf-style formatting of the filename; we only need the names to sort in the order the files were written.

      Rob
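
The "give me one file after this key" request described above can be sketched without touching S3 at all. The example below stands in for the bucket with a sorted set and looks up the first key strictly after a marker; the class and method names are hypothetical, and plain Java string comparison is used as a stand-in for S3's key ordering, which matches for ASCII keys like these.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical illustration of marker-style listing over lexicographically sorted keys.
class NextKeySketch {
    // Returns the first key that sorts strictly after the marker, or null if there is none.
    static String nextKey(TreeSet<String> keys, String marker) {
        return keys.higher(marker);
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>(Arrays.asList(
                "2011-03-17/0400/flume-1.seq",
                "2011-03-17/0400/flume-2.seq",
                "2011-03-17/0400/flume-10.seq"));

        // With an unpadded counter, flume-10 sorts before flume-2,
        // so the "next" key is not the next file that was written.
        System.out.println(nextKey(keys, "2011-03-17/0400/flume-1.seq"));
    }
}
```

This is why the request is about names that are written in sorted order rather than about any particular format: an unpadded counter such as `flume-10` lands between `flume-1` and `flume-2`, while a fixed-width, timestamp-first name does not have that problem.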

          People

            Assignee: jmhsieh Jonathan Hsieh
            Reporter: flume_robert.slifka@gmail.com Disabled imported user