Details
- Type: Task
- Status: Resolved
- Priority: P3
- Resolution: Abandoned
Description
Beam has Javadocs describing how to create a read or write transform, but no friendly user guide on getting started with BoundedSource/BoundedReader.
This should cover:
- background on Beam's source/sink API design
- design patterns
- evaluating different data sources (e.g., what are the properties of a pub/sub system that affect how you should write your UnboundedSource? What is the best design for reading from a NoSQL-style source?)
- testing: how to write unit and integration tests (and, once we have them, performance tests)
- public API recommendations
This is related to, but does not strictly overlap with:
- https://issues.apache.org/jira/browse/BEAM-193
- the Dataflow SDK documentation for "Custom Sources and Sinks", which contains some information about writing Sources/Sinks but is somewhat out of date and doesn't reflect the things we've learned recently.
Issue Links
- is cloned by: BEAM-1026 User guide - "How to create Beam IO Transforms" (Resolved)
1. Update Beam site to reflect new Pipeline I/O docs structure | Resolved | Stephen Sisk
2. Add language-neutral overview content for Pipeline I/O authoring docs | Resolved | Stephen Sisk
3. Move python content for Pipeline I/O authoring over into new Pipeline I/O section | Resolved | Unassigned
4. Add java content for Pipeline I/O authoring | Resolved | Unassigned
5. Pipeline I/O - add content on how to unit test | Resolved | Stephen Sisk
6. Pipeline I/O docs - add content on contributing I/O transforms | Resolved | Unassigned
7. I/O Testing docs - Integration Testing section | Resolved | Unassigned
8. Add better documentation on testing unbounded I/O scenarios | Open | Unassigned
9. Add better documentation on testing Python I/O transforms | Open | Unassigned