Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
It would be nice if the User Guide had a section that lists considerations for designing an end-to-end Flume topology. I think a lot of people get lost in the long list of sources, sinks, and so on, and need a higher-level overview of what to think about when designing a flow. Examples would be:
- When to use Flume? Types of data that Flume is good at handling (e.g. regularly generated, event-based).
- Reliability (explaining that flow reliability is a function of channels used, redundancy in the flow, and other factors)
- Flume sizing (some basic ideas about how to size the nodes or network you are running on)
The design space is too large to give precise recommendations, but just orienting users toward the main things they need to think about would be really helpful. Some issues, like reliability, are much harder to explain for Flume NG than for Flume OG, and I think we need documentation that makes this explicit. Down the road, a "cookbook" with specific examples would be even better.
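As a rough illustration of the kind of "basic idea" the sizing item could cover, here is a back-of-envelope sketch in Python. It estimates how many events a channel would need to buffer to ride out a downstream outage. The function names, the headroom factor, and all workload numbers are hypothetical and chosen only for illustration; they are not Flume recommendations.

```python
import math


def channel_capacity_needed(events_per_sec, outage_secs, headroom=1.5):
    """Estimate the events a channel must hold while the sink side is down.

    headroom is an assumed safety multiplier, not a Flume setting.
    """
    return math.ceil(events_per_sec * outage_secs * headroom)


def buffer_bytes(capacity_events, avg_event_bytes):
    """Approximate payload size of a full channel (ignores per-event overhead)."""
    return capacity_events * avg_event_bytes


# Hypothetical workload: 2,000 events/s, 500-byte events, 10-minute outage.
capacity = channel_capacity_needed(2000, 600)
print(capacity)                     # 1800000
print(buffer_bytes(capacity, 500))  # 900000000
```

A figure like this could feed into a channel's `capacity` setting and into the choice between a memory-backed and a disk-backed channel, but the real numbers depend on the workload and hardware.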
Thoughts?