Details
- New Feature
- Status: Resolved
- Major
- Resolution: Fixed
- 2.4.0
- None
- None
Description
Apache Avro (https://avro.apache.org) is a data serialization format. It is widely used in the Spark and Hadoop ecosystem, especially for Kafka-based data pipelines. With the spark-avro package (https://github.com/databricks/spark-avro), Spark SQL can read and write Avro data. Making spark-avro built-in would give first-time users of Spark SQL and Structured Streaming a better experience, and we expect a built-in Avro data source to further improve the adoption of Structured Streaming. We should consider inlining https://github.com/databricks/spark-avro.
Issue Links
- Is contained by: SPARK-24768 Have a built-in AVRO data source implementation (Resolved)