[FLINK-25592] Improvement of parser, optimizer and execution for Flink Batch SQL - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Table SQL / API, Table SQL / Planner, Table SQL / Runtime
Labels: None

Description

This is a parent JIRA to track improvements to Flink Batch SQL, including the parser, optimizer, and execution.
For example:
1. With the Hive dialect and the default dialect, some SQL queries are translated into different plans.
2. Specify the hash/sort aggregate strategy and the hash/sort-merge join strategy in SQL hints.
3. Take Parquet metadata into consideration during optimization.
4. And so on.
Please note that some improvements are not limited to batch SQL; streaming SQL jobs could also benefit from some of the improvements in this JIRA.

Sub-Tasks

1.	Case when would be translated into different expression in Hive dialect and default dialect	Resolved	Unassigned
2.	A redundant scan could be skipped if it is an input of join and the other input is empty after partition prune	Closed	Yunhong Zheng
3.	Take parquet metadata into consideration when source is parquet files	Open	luoyuxia
4.	Specify hash/sort aggregate strategy in SQL hint	Closed	ZhuoYu Chen
5.	Specify hash/sortmerge join in SQL hint	Open	luoyuxia
6.	Remove useless aggregate function	Open	godfrey he
7.	Batch get statistics of multiple partitions instead of get one by one	Resolved	tartarus
8.	Cannot join hive tables with different column types	Closed	Unassigned
9.	Unexpected aggregate plan after load hive module	Resolved	luoyuxia
10.	UnsupportedOperationException would thrown out when hash shuffle by a field with array type	Closed	dalongliu
11.	CalcOperator CodeGenException: Boolean expression type expected	Resolved	Unassigned
12.	Flink doesn't support Hive primitive type void yet	Closed	luoyuxia
13.	Hive Dialect support implicit conversion	Resolved	luoyuxia
14.	Unexpected rexnode : org.apache.calcite.rex.RexFieldAccess	Resolved	Unassigned
15.	CodeGenException: Unable to find common type of	Resolved	Unassigned
16.	Failed to get Hive result type from org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileApprox	Resolved	Unassigned
17.	Field #1: values VARCHAR(2147483647) ARRAY does not exist for expression index($0, 0)	Closed	Unassigned
18.	throw NPE if multi MAPJOIN hint union all	Closed	luoyuxia
19.	Support Insert Multi-Table	Closed	luoyuxia
20.	Support Hive bucket table	In Progress	luoyuxia
21.	Flink supports all modes of Hive UDAF (PARTIAL1, PARTIAL2, FINAL, COMPLETE)	Resolved	luoyuxia
22.	Min aggregate function support type: ''ARRAY''.	Resolved	Unassigned
23.	Flink batch support for Hive StorageHandlers	Open	luoyuxia
24.	Hive dialect fails using union map type	Open	luoyuxia
25.	Add Hive partition when flink has no data to write	Closed	tartarus
26.	Fix Hive sink not write a success file after finish writing in batch mode	Closed	tartarus
27.	Allow user to configure whether to enable sort or not when it's for dynamic parition writing for HiveSource	Closed	luoyuxia

People

Assignee: Unassigned

Reporter: Jing Zhang

Votes: 0

Watchers: 21

Dates

Created: 10/Jan/22 13:36

Updated: 09/Aug/22 04:42