Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Versions: 0.12.0, 0.14.1, 1.0.0
Fix Version: None
Description
I have been working on Spark integration with Arrow.
I realised that there is no way to use a socket as a source or sink for the Arrow stream format. For instance, I want to do something like:
connStream <- socketConnection(port = 9999, blocking = TRUE, open = "wb")
rdf_slices <- list()  # placeholder: a list of data frames
stream_writer <- NULL
tryCatch({
  for (rdf_slice in rdf_slices) {
    batch <- record_batch(rdf_slice)
    if (is.null(stream_writer)) {
      # Here there appears to be no way to pass a socket connection.
      stream_writer <- RecordBatchStreamWriter(connStream, batch$schema)
    }
    stream_writer$write_batch(batch)
  }
}, finally = {
  if (!is.null(stream_writer)) {
    stream_writer$close()
  }
})
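For comparison, on the Python side pyarrow's stream writer accepts any file-like object, so a socket can be wrapped and used as the sink directly. A minimal sketch, assuming a placeholder host/port and placeholder data standing in for the sliced data frames:

import socket
import pyarrow as pa

# Placeholder data standing in for the list of data frame slices.
batches = [pa.RecordBatch.from_arrays([pa.array([1, 2, 3])], names=["x"])]

sock = socket.create_connection(("localhost", 9999))  # placeholder host/port
sink = sock.makefile(mode="wb")  # any file-like object works as the sink

writer = None
try:
    for batch in batches:
        if writer is None:
            writer = pa.RecordBatchStreamWriter(sink, batch.schema)
        writer.write_batch(batch)
finally:
    if writer is not None:
        writer.close()
    sink.close()
    sock.close()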
Likewise, I cannot find a way to iterate over the stream batch by batch:
RecordBatchStreamReader(connStream)$batches()  # Again, there appears to be no way to use a socket here.
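The reading end works the same way in Python: wrap the accepted socket in a file-like object and iterate the stream batch by batch. Again a sketch, with a placeholder port:

import socket
import pyarrow as pa

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("", 9999))  # placeholder port
server.listen(1)
conn, _ = server.accept()

source = conn.makefile(mode="rb")  # file-like wrapper around the socket
reader = pa.RecordBatchStreamReader(source)
for batch in reader:  # yields one RecordBatch at a time
    print(batch.num_rows)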
This looks easily possible on the Python side but appears to be missing from the R API.
Issue Links
- duplicates ARROW-9235 [R] Support for `connection` class when reading and writing files (Resolved)