[SPARK-24741] Have a built-in AVRO data source implementation - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.4.0
Fix Version/s: None
Component/s: SQL, Structured Streaming
Labels: None
Target Version/s: 2.4.0

Description

Apache Avro (https://avro.apache.org) is a data serialization format. It is widely used in the Spark and Hadoop ecosystem, especially in Kafka-based data pipelines. With the spark-avro package (https://github.com/databricks/spark-avro), Spark SQL can read and write Avro data. Making spark-avro built-in would give first-time users of Spark SQL and Structured Streaming a better experience. We expect that a built-in Avro data source will further improve the adoption of Structured Streaming. We should consider inlining https://github.com/databricks/spark-avro.

Issue Links

Is contained by: SPARK-24768 Have a built-in AVRO data source implementation (Resolved)

People

Assignee: Gengliang Wang
Reporter: Xiao Li
Votes: 0
Watchers: 6

Dates

Created: 05/Jul/18 00:26
Updated: 13/Jul/18 15:55
Resolved: 12/Jul/18 20:58