Flume / FLUME-2155

Improve replay time


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.5.0

    Description

      File Channel has scaled so well that people now run channels holding hundreds of millions of events. At this scale, replay can be extremely slow even between checkpoints, because the remove() method in FlumeEventQueue moves every pointer that follows the one being removed: a single remove can cause more than 99 million moves in a channel of 100 million events. There are several ways to improve this. One is to defer the moves to the end of replay, somewhat like a compaction. Another is to use the fact that all removes happen from the top of the queue: move the first "k" events out to a hash set and remove from there. We can find k using the write ID of the last checkpoint and the current one.
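
The two ideas in the description can be sketched roughly as follows. This is an illustrative toy, not the code from any of the attached patches: the class ReplaySketch and its methods are hypothetical, the queue is modeled as a plain list of unique Long pointers rather than the real FlumeEventQueue, and k is passed in directly instead of being derived from checkpoint write IDs. A LinkedHashSet stands in for the "hashset" so that queue order survives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and structure are assumptions, not Flume's API.
final class ReplaySketch {

    // Baseline: each remove shifts every pointer that follows it,
    // so replaying many takes costs O(takes * queueSize).
    static List<Long> naiveReplay(List<Long> queue, List<Long> takes) {
        List<Long> result = new ArrayList<>(queue);
        for (Long ptr : takes) {
            result.remove(ptr); // remove(Object): linear scan plus shift
        }
        return result;
    }

    // Idea 1: record the removes during replay and do a single
    // compaction pass at the end.
    static List<Long> compactingReplay(List<Long> queue, List<Long> takes) {
        Set<Long> removed = new HashSet<>(takes);
        List<Long> result = new ArrayList<>(queue.size());
        for (Long ptr : queue) {
            if (!removed.contains(ptr)) {
                result.add(ptr);
            }
        }
        return result;
    }

    // Idea 2: assume every remove hits one of the first k pointers.
    // Move that head into an ordered hash set, remove from the set,
    // then put the survivors back in front of the untouched tail.
    static List<Long> headSetReplay(List<Long> queue, List<Long> takes, int k) {
        int head = Math.min(k, queue.size());
        Set<Long> headSet = new LinkedHashSet<>(queue.subList(0, head));
        for (Long ptr : takes) {
            headSet.remove(ptr); // constant time on average, no shifting
        }
        List<Long> result = new ArrayList<>(headSet);
        result.addAll(queue.subList(head, queue.size()));
        return result;
    }

    public static void main(String[] args) {
        List<Long> queue = new ArrayList<>();
        for (long i = 0; i < 10; i++) {
            queue.add(i);
        }
        List<Long> takes = Arrays.asList(0L, 1L, 2L);
        System.out.println(naiveReplay(queue, takes));
        System.out.println(compactingReplay(queue, takes));
        System.out.println(headSetReplay(queue, takes, 4));
    }
}
```

All three methods should produce the same remaining queue when the assumption holds. The second and third avoid the per-remove shift, which is where the cost described above comes from.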

      Attachments

        1. SmartReplay1.1.pdf
          86 kB
          Hari Shreedharan
        2. SmartReplay.pdf
          72 kB
          Hari Shreedharan
        3. FLUME-FC-SLOW-REPLAY-FIX-1.patch
          4 kB
          Brock Noland
        4. FLUME-FC-SLOW-REPLAY-1.patch
          11 kB
          Brock Noland
        5. FLUME-2155-initial.patch
          2 kB
          Hari Shreedharan
        6. FLUME-2155.patch
          12 kB
          Hari Shreedharan
        7. FLUME-2155.5.patch
          37 kB
          Brock Noland
        8. FLUME-2155.4.patch
          35 kB
          Brock Noland
        9. FLUME-2155.2.patch
          34 kB
          Brock Noland
        10. fc-test.patch
          12 kB
          Hari Shreedharan
        11. 700000-710000
          5.70 MB
          Hari Shreedharan
        12. 300000-310000
          5.70 MB
          Hari Shreedharan
        13. 10000-20000
          5.69 MB
          Hari Shreedharan
        14. 100000-110000
          5.70 MB
          Hari Shreedharan

          People

            brocknoland Brock Noland
            hshreedharan Hari Shreedharan