Details
Type: Improvement
Status: Resolved
Priority: P2
Resolution: Fixed
Description
Maintaining state in BoundedSource implementations is problematic and can lead to hard-to-debug errors, for example (1) pickling errors and (2) errors caused by a runner reusing a BoundedSource object that carries state.
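The pickling failure in (1) can be reproduced outside of any runner. The sketch below is only an illustration: `SourceWithState`, `SourceWithoutState` and `can_pickle` are hypothetical names, not part of the BoundedSource API, and the standard-library `pickle` module stands in for whatever serializer a runner actually uses.

```python
import pickle
import threading


class SourceWithState:
    """Hypothetical source-like class that keeps a lock as transient state."""

    def __init__(self, path):
        self.path = path
        # Transient state stored on the object; lock objects cannot be pickled.
        self._lock = threading.Lock()


class SourceWithoutState:
    """Hypothetical source-like class that stores only its configuration."""

    def __init__(self, path):
        self.path = path


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if it raises."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False
```

Here `can_pickle(SourceWithState("input.txt"))` returns False because of the lock attribute, while a source that holds only plain configuration, such as `SourceWithoutState`, would be expected to serialize without trouble.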
We can try to prevent users from adding state to BoundedSource implementations in the following two ways:
(1) Clearly state in the BoundedSource API documentation that source objects should not maintain transient state.
(2) Update source_test_utils to catch source objects that maintain local state.
Option (2) can be done by adding a check that verifies that a source produces the expected output in the presence of a re-entrant read:
i = s.read_records()
next(i)
next(i)
# Read the whole thing from a second, separate call to s.read_records().
# Call next(i) some more, until 'i' is exhausted.
# Verify that 'i' produced the correct output.
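The re-entrant read check outlined above could look roughly like the following. This is a minimal, self-contained sketch and not the actual test utility: `StatelessSource`, `StatefulSource` and `check_reentrant_read` are hypothetical names, and `read_records()` is modelled as a plain generator method, following the pseudocode in this description.

```python
import itertools


class StatelessSource:
    """Hypothetical source: each read keeps its position in local variables."""

    def __init__(self, records):
        self._records = list(records)

    def read_records(self):
        for record in self._records:
            yield record


class StatefulSource:
    """Hypothetical buggy source: the read position lives on the object."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0  # shared by every reader of this object: this is the bug

    def read_records(self):
        self._pos = 0
        while self._pos < len(self._records):
            record = self._records[self._pos]
            self._pos += 1
            yield record


def check_reentrant_read(source, expected, records_before_reentry=2):
    """Return True if an interrupted reader still yields `expected`."""
    first = source.read_records()
    # Consume a few records from the first reader, then pause it.
    seen = list(itertools.islice(first, records_before_reentry))
    # Re-entrant read: drain a second, independent reader to completion.
    list(source.read_records())
    # Resume the first reader and collect whatever it still produces.
    seen.extend(first)
    return seen == list(expected)
```

With five input records, `check_reentrant_read(StatelessSource(range(5)), range(5))` returns True, whereas the same call on `StatefulSource` returns False: the second read advances the shared `_pos` to the end, so the first reader stops after its initial two records.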