Details
-
Task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
4.0.0
Description
Currently we run long running python worker process for python streaming source and sink to perform planning, commit and abort in driver side. Testing indicate that current implementation cause connection timeout error when streaming query has large trigger interval
For python streaming source, keep the long running worker archaetecture but set the socket timeout to be infinity to avoid timeout error.
For python streaming sink, since StreamingWrite is also created per microbatch in scala side, long running worker cannot be attached to s StreamingWrite instance. Therefore we abandon the long running worker architecture, simply call commit() or abort() and exit the worker and allow spark to reuse worker for us.
Attachments
Issue Links
- links to