Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.10.0
- Fix Version/s: None
- Component/s: None
-
Description
I read data from a Hive source table through Flink SQL and then write it into a Hive target table. The target table is partitioned. When one partition holds far more data than the others, data skew occurs and the job takes a very long time to execute.
With the default configuration, the same SQL takes about five minutes with Hive on Spark but roughly 40 minutes with Flink.
Example:
-- schema of myparttable: name string, age int, PARTITIONED BY (type string, day string)
INSERT OVERWRITE myparttable
SELECT name, age, type, day FROM sourcetable;
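For context, here is a minimal sketch of the full scenario in Flink SQL. The CREATE TABLE statements mirror the schema described above (in practice these would be existing tables registered through the Hive catalog), and the SET line shows how an option to disable the shuffle on the dynamic partition columns, in the spirit of FLINK-15006 linked below, might be applied. The option key sink.shuffle-by-partition.enable and the SET syntax are assumptions for illustration, not confirmed parts of the 1.10.0 release.

-- Source table: holds the unpartitioned input data.
CREATE TABLE sourcetable (
  name STRING,
  age INT,
  `type` STRING,
  `day` STRING
);

-- Partitioned target table, matching the schema above.
CREATE TABLE myparttable (
  name STRING,
  age INT
) PARTITIONED BY (`type` STRING, `day` STRING);

-- Assumed option (see FLINK-15006): with the shuffle on the dynamic
-- partition columns disabled, several subtasks can write a hot partition
-- in parallel instead of funneling all of its rows through one writer.
SET 'sink.shuffle-by-partition.enable' = 'false';

-- Dynamic partition insert: type and day come from the query result, so
-- shuffling by partition sends every row of a large partition to a single
-- subtask, which is where the skew described above comes from.
INSERT OVERWRITE myparttable
SELECT name, age, `type`, `day` FROM sourcetable;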
Issue Links
- is duplicated by:
  - FLINK-15006 Add option to close shuffle when dynamic partition inserting (Closed)
- is related to:
  - FLINK-15006 Add option to close shuffle when dynamic partition inserting (Closed)