Description
The JSON and CSV write paths buffer an entire line (or several lines) in memory before writing to disk, which is inefficient for large rows. It may make sense to bypass the TextOutputFormat record writer and write directly to the underlying FSDataOutputStream, so that the writers can append arbitrary byte arrays (fractions of a row) instead of a full row.
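A minimal sketch of the idea, not the actual Spark change: each field is pushed to the stream as soon as it is available, so the full row is never assembled in memory. The class `StreamingRowWriter` and the method `writeRow` are hypothetical names. The sketch takes a plain `java.io.OutputStream` so it runs without Hadoop; since `FSDataOutputStream` is an `OutputStream` subclass, it could be passed in the same way. Quoting and escaping are left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Spark or Hadoop.
final class StreamingRowWriter {
    private static final byte[] SEPARATOR = {','};
    private static final byte[] NEWLINE = {'\n'};

    // Writes one row fragment by fragment. No full-row buffer is built;
    // each byte array goes to the stream as soon as it is seen.
    // Assumption: fields are already encoded and need no escaping.
    static void writeRow(OutputStream out, Iterable<byte[]> fields) throws IOException {
        boolean first = true;
        for (byte[] field : fields) {
            if (!first) {
                out.write(SEPARATOR);
            }
            out.write(field);
            first = false;
        }
        out.write(NEWLINE);
    }

    public static void main(String[] args) throws IOException {
        // An in-memory sink stands in for the file system stream here.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        List<byte[]> row = Arrays.asList(
                "id".getBytes(StandardCharsets.UTF_8),
                "payload".getBytes(StandardCharsets.UTF_8));
        writeRow(sink, row);
        System.out.print(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With this shape, peak memory per row is bounded by the largest single fragment rather than by the size of the whole row, provided the caller produces the fragments lazily.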
Issue Links
- is duplicated by: SPARK-18984 Concat with ds.write.text() throw exception if column contains null data (Resolved)