Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
A partition is a division of a logical database or its constituent elements into distinct independent parts. Database partitioning is normally done for manageability, performance or availability reasons, or for load balancing.
Partitioning is widely used in Hive. Especially in the ETL domain, most tables have partition attributes, which allow users to continue processing. Partitioning also makes data management more convenient; partitioning by time and by business attribute are both common.
Flink has supported a basic version of table partitioning since the 1.9 release, and this JIRA aims to improve that partitioning support.
See https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
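As a rough illustration of the time and business partitioning described above, the following sketch groups rows by their partition columns and renders each group under a Hive-style `key=value` path. It does not use any Flink or Hive API; the helper names (`partition_path`, `group_by_partition`) and the sample columns (`dt`, `region`) are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def partition_path(spec):
    """Render an ordered partition spec as a Hive-style relative path."""
    return "/".join(f"{key}={value}" for key, value in spec)


def group_by_partition(rows, partition_keys):
    """Group rows (dicts) by the values of their partition columns."""
    groups = defaultdict(list)
    for row in rows:
        spec = tuple((key, row[key]) for key in partition_keys)
        # Only non-partition columns are kept as the row data of a partition.
        data = {k: v for k, v in row.items() if k not in partition_keys}
        groups[partition_path(spec)].append(data)
    return dict(groups)


# Hypothetical sample data: "dt" is a time partition, "region" a business one.
rows = [
    {"user": "a", "amount": 10, "dt": "2020-01-01", "region": "eu"},
    {"user": "b", "amount": 20, "dt": "2020-01-01", "region": "us"},
    {"user": "c", "amount": 30, "dt": "2020-01-02", "region": "eu"},
]

partitions = group_by_partition(rows, ["dt", "region"])
for path in sorted(partitions):
    print(path, partitions[path])
```

Because each partition maps to its own path, a job can read, overwrite, or drop one day or one region without touching the rest of the table, which is the manageability benefit the description refers to.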
Issue Links
- relates to
  - FLINK-27237 Partitioned table statement enhancement (Open)
  - FLINK-14256 [Umbrella] Introduce FileSystemTableFactory with partitioned support (Closed)
| # | Sub-task | Status | Assignee |
|---|---|---|---|
| 1 | Add hash distribution and sort grouping only when dynamic partition insert | Closed | Jingsong Lee |
| 2 | Introduce FileSystemOutputFormat for batch | Closed | Jingsong Lee |
| 3 | Don't use ContinuousFileReaderOperator to support multiple paths | Resolved | Jingsong Lee |
| 4 | Partition field names should be got from CatalogTable instead of source/sink | Closed | Jingsong Lee |
| 5 | Introduce listPartitionsByFilter to Catalog | Closed | Jingsong Lee |
| 6 | Implement listPartitionsByFilter to HiveCatalog | Closed | Rui Li |
| 7 | Handle escape characters for partition path | Closed | Unassigned |
| 8 | Enable partition statistics in blink planner | Closed | Jingsong Lee |