[SPARK-22386] Data Source V2 improvements - ASF JIRA

Details

Type: Improvement
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.3.0
Fix Version/s: None
Component/s: SQL
Labels: releasenotes
Target Version/s: 3.0.0

Issue Links

depends upon	SPARK-25531 new write APIs for data source v2	Resolved
is duplicated by	SPARK-9182 filter and groupBy on DataFrames are not passed through to jdbc source	Resolved
is related to	SPARK-23521 SPIP: Standardize SQL logical plans with DataSourceV2	Resolved
relates to	SPARK-26088 DataSourceV2 should expose row count and attribute statistics	Resolved

Sub-Tasks

1.	Limit push down	Resolved	Unassigned
2.	Aggregate push down	Resolved	Unassigned
3.	add `MetadataCreationSupport` trait to separate data and metadata handling at write path	Resolved	Unassigned
4.	DataSourceV2 should use immutable trees.	Resolved	Ryan Blue
5.	DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter	Resolved	Unassigned
6.	Reorganize packages in data source V2	Resolved	Gengliang Wang
7.	DataSourceV2 should apply some validation when writing.	Resolved	Unassigned
8.	DataSourceV2 should use the output commit coordinator.	Resolved	Ryan Blue
9.	DataSourceV2 readers should always produce InternalRow.	Resolved	Ryan Blue
10.	DataSourceOptions should handle path and table names to avoid confusion.	Resolved	Wenchen Fan
11.	use InternalRow in DataSourceWriter	Resolved	Wenchen Fan
12.	DataSourceV2 should provide a way to get a source's schema.	Resolved	Unassigned
13.	DataSourceV2 should not allow userSpecifiedSchema without ReadSupportWithSchema	Resolved	Ryan Blue
14.	DataSourceV2: Rename DataReaderFactory to InputPartition.	Resolved	Ryan Blue
15.	Data Source V2: Join Push Down	Resolved	Unassigned
16.	DataSourceV2 should push filters and projection at physical plan conversion	Resolved	Ryan Blue
17.	remove SupportsDeprecatedScanRow	Resolved	Wenchen Fan
18.	Add support for USING syntax for DataSourceV2	Resolved	Unassigned
19.	merge ReadSupport and ReadSupportWithSchema	Resolved	Wenchen Fan
20.	DataSourceV2: Remove SupportsPushDownCatalystFilters	Resolved	Reynold Xin
21.	DataSourceV2: Add interfaces to pass required sorting and clustering for writes	Resolved	Unassigned
22.	DataSourceV2: Structured Streaming does not respect SessionConfigSupport	Resolved	Hyukjin Kwon
23.	Avoid to create a readsupport at write path in Data Source V2	Resolved	Hyukjin Kwon
24.	Recover options and properties and pass them back into the v1 API	Open	Unassigned
25.	DataSourceV2: Add new DataFrameWriter API for v2	Resolved	Ryan Blue
26.	Pass in number of partitions to BuildWriter	Resolved	Ximo Guanter
27.	DataSource V2: API to request distribution and ordering on write	Resolved	Anton Okolnychyi
28.	Data Source V2: Remove read specific distributions	Open	Unassigned
29.	DataSource V2: Build logical writes in the optimizer	Resolved	Anton Okolnychyi
30.	DataSource V2: Inject repartition and sort nodes to satisfy required distribution and ordering	Resolved	Anton Okolnychyi
31.	DataSource V2: Use Write abstraction in StreamExecution	Resolved	Anton Okolnychyi
32.	DataSource V2: Support required distribution and ordering in SS	Resolved	Anton Okolnychyi
33.	Let AQE determine the right parallelism in DistributionAndOrderingUtils	Open	Unassigned
34.	DS V2 Aggregate push down	Resolved	Huaxin Gao
35.	Aggregate (Min/Max/Count) push down for ORC	Resolved	Cheng Su
36.	Aggregate (Min/Max/Count) push down for Parquet	Resolved	Huaxin Gao
37.	Push down group by partition column for Aggregate (Min/Max/Count) for Parquet	Resolved	Huaxin Gao
38.	Push down filter by partition column for Aggregate (Min/Max/Count) for Parquet	Resolved	Huaxin Gao
39.	Add benchmark for aggregate push down	Open	Unassigned
40.	Do not split input file for Parquet reader with aggregate push down	Resolved	Cheng Su
41.	Not log empty aggregate and group by in JDBCScan	Resolved	Huaxin Gao
42.	DataSourceV2: Distribution and ordering support V2 function in writing	Resolved	Cheng Pan

People

Assignee: Unassigned
Reporter: Wenchen Fan
Votes: 9
Watchers: 47

Dates

Created: 29/Oct/17 12:53
Updated: 12/Dec/22 18:11