[FLINK-31338] support infer parallelism for flink table store - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: table-store-0.3.0
Fix Version/s: table-store-0.4.0
Component/s: Table Store
Labels:
- pull-request-available

Description

When querying an FTS (Flink Table Store) table with Flink, the scan parallelism can be configured by setting scan.parallelism. However, users often do not know what value to choose: setting it too high wastes resources, and setting it too low makes the query slow. We can therefore add parallelism inference.

Inference is enabled by default, and the inferred parallelism equals the number of read splits. Users can turn inference off manually. To prevent a large number of data files from producing excessive parallelism, we also set a maximum inferred parallelism; when the inferred value exceeds this maximum, the maximum is used instead.
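The rule described above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the code from the pull request: the function name `infer_scan_parallelism`, its parameters, the choice to return `None` when inference is turned off, and the floor of 1 for an empty split list are all assumptions made for the example.

```python
from typing import Optional


def infer_scan_parallelism(
    split_count: int,
    max_inferred: int,
    infer_enabled: bool = True,
) -> Optional[int]:
    """Hypothetical helper: derive a scan parallelism from the number of read splits.

    Returns None when inference is disabled, meaning the caller falls back
    to whatever parallelism is configured elsewhere (an assumption of this sketch).
    """
    if not infer_enabled:
        return None
    # One subtask per read split, capped at the configured maximum.
    capped = min(split_count, max_inferred)
    # Assumption: never go below 1, even if there are no splits to read.
    return max(1, capped)
```

For example, 10 splits with a maximum of 1024 give a parallelism of 10, while 5000 splits with the same maximum give 1024.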

In addition, when a limit is pushed down, the inferred parallelism should be compared with the limit in the SELECT statement to get a more appropriate value. For example, for the query select * from table limit 1, the inferred parallelism might be 10, but a parallelism of 1 is enough because only one row is needed.
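The limit comparison can be illustrated the same way. Again, this is a sketch under stated assumptions rather than the actual implementation: `cap_by_limit` is a hypothetical name, a missing limit is modeled as `None`, and the result is floored at 1.

```python
from typing import Optional


def cap_by_limit(inferred: int, limit: Optional[int]) -> int:
    """Hypothetical helper: reduce an inferred parallelism using a pushed-down LIMIT."""
    if limit is None:
        # No limit was pushed down, so the inferred value is kept as-is.
        return inferred
    # More subtasks than requested rows cannot help, so take the smaller value.
    # Assumption: keep at least one subtask.
    return max(1, min(inferred, limit))
```

With the example from the description, `cap_by_limit(10, 1)` yields 1, so a `limit 1` query runs with a single parallel reader instead of ten.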

Issue Links

links to

GitHub Pull Request #584

People

Assignee: Unassigned

Reporter: Jun Zhang

Votes: 0

Watchers: 2

Dates

Created: 06/Mar/23 10:00

Updated: 02/Apr/23 07:16

Resolved: 19/Mar/23 05:36