Flume / FLUME-559

Add compression and batching features to rpcsinks and rpcsources.


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.9.4
    • Fix Version/s: 0.9.5
    • Component/s: None
    • Labels: None

    Description

      Currently, batching and compression options can be specified as data-flow elements (decorators), but there are subtle issues that make them difficult to use effectively, especially in the e2e case.

      The proposal here is to add compression and batching features to the RPC sinks. This will likely require the addition of a "flush" or "sync" call to the sink/decorator interface. However, it will greatly simplify the use of these optimizations from a user's perspective.
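
      As a rough sketch (the names and signatures here are assumptions, not Flume's actual 0.9.x API), the interface addition might look like the following, with flush() forcing any internally buffered events out to the wire:

      import java.io.IOException;

      // Illustrative sink/decorator interface with the proposed flush/sync
      // call added; Event stands in for Flume's event type.
      public interface EventSink {
        void open() throws IOException;
        void append(Event e) throws IOException;
        // Proposed addition: emit any batched/compressed events immediately.
        void flush() throws IOException;
        void close() throws IOException;
      }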

      Here are some examples:

      This is ok:

      batch(100) gzip rpcSink("xxx",1234)
      

      In the new implementation, it would be something like:

      rpcSink("xxx",1234, compression="gzip", batch="count(100)")
      
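      As a hedged sketch of what such a sink might do internally (all names here are illustrative, and sendRpc() is a placeholder rather than a real Flume call): buffer appended event bodies and, once the count threshold is reached, gzip the batch and ship it as a single RPC payload.

      import java.io.ByteArrayOutputStream;
      import java.io.IOException;
      import java.util.ArrayList;
      import java.util.List;
      import java.util.zip.GZIPOutputStream;

      // Hypothetical internals for rpcSink(..., compression="gzip",
      // batch="count(100)"). Per-event framing inside the batch is
      // omitted for brevity.
      class BatchingGzipRpcSink {
        private final List<byte[]> buffer = new ArrayList<>();
        private final int batchSize;

        BatchingGzipRpcSink(int batchSize) { this.batchSize = batchSize; }

        void append(byte[] eventBody) throws IOException {
          buffer.add(eventBody);
          if (buffer.size() >= batchSize) {
            flush(); // the proposed sink-level flush
          }
        }

        void flush() throws IOException {
          if (buffer.isEmpty()) return;
          ByteArrayOutputStream bos = new ByteArrayOutputStream();
          try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            for (byte[] body : buffer) {
              gz.write(body);
            }
          }
          sendRpc(bos.toByteArray()); // one wire call per compressed batch
          buffer.clear();
        }

        private void sendRpc(byte[] payload) {
          // placeholder for the actual RPC client send
        }
      }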

      Ideally, the rpcSources will be able to just accept compressed or batched data.
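
      One way a source could do that (an assumption, not existing Flume code) is to sniff the incoming payload: the standard GZIP magic bytes identify compressed batches, so no gunzip/unbatch decorators are needed on the receiving side.

      // Sketch: detect a gzipped payload by its magic number so the
      // rpcSource can transparently decompress before unbatching.
      static boolean isGzipped(byte[] payload) {
        return payload.length >= 2
            && (payload[0] & 0xff) == 0x1f
            && (payload[1] & 0xff) == 0x8b; // GZIP magic bytes
      }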

      Here are some examples of things that seem inconsistent and take too long to explain (and thus are too complicated).

      Today, this should work, essentially as expected:

      agent : source | batch(100) gzip agentBESink("collector");
      collector : collectorSource | gunzip unbatch collectorSink("XXX");
      

      This works, but it may not work the way one would expect (events in the batching buffer can get lost, because the WAL happens after batching/gzipping):

      agent : source | batch(100) gzip agentE2ESink("collector");
      collector : collectorSource | gunzip unbatch collectorSink("XXX");
      

      This one will not work (compressed events have a zero-size body, and acks are computed over bodies, so the acks are worthless; see the sketch after this example):

      agent : source | batch(100) gzip agentE2ESink("collector");
      collector : collectorSource | collector(30000) { gunzip unbatch escapedCustomDfs("XXX","yyy") };
      
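      To illustrate the ack problem (hypothetical code; Flume's actual ack path differs in detail): if the end-to-end checksum is computed over each event's body, a gzip decorator that leaves a zero-length body makes every event's checksum degenerate, so per-event delivery can never be verified.

      import java.util.zip.CRC32;

      // Hypothetical body-based ack checksum: after gzip empties the
      // body, every event produces the same value.
      static long ackChecksum(byte[] body) {
        CRC32 crc = new CRC32();
        crc.update(body); // zero-length body => identical checksums
        return crc.getValue();
      }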

      Attachments

        Issue Links

        Activity


          People

            Assignee: Unassigned
            Reporter: Jonathan Hsieh (jmhsieh)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved:
