Details
- Type: Bug
- Status: Open
- Priority: Critical
- Resolution: Unresolved
- 1.15.0
- None
Description
When running org.apache.flink.table.tpcds.TpcdsTestProgram with the AdaptiveBatchScheduler, I ran into a problem: the number of records sent by the source operator is always 1, and the parallelism of the source operator is also 1, even though I set jobmanager.adaptive-batch-scheduler.default-source-parallelism to 8.
After some investigation, I found that operator A is not the actual file reader; it only splits the files and assigns the splits to downstream tasks for further processing. Operator B is the actual file reader. Here, the parallelism of operator B is 64, while operator A sends only 1 record. This means operator A assigned all splits to a single task of operator B and the other 63 tasks of operator B are idle, which is unreasonable.
In this case, the parallelism of operator B should equal jobmanager.adaptive-batch-scheduler.default-source-parallelism, and the number of records sent by operator A should also equal jobmanager.adaptive-batch-scheduler.default-source-parallelism.
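To make the observed and expected numbers concrete, here is a minimal plain-Java sketch of the arithmetic described above. It does not use any Flink API. The class name, the `distribute` and `idleTasks` helpers, and the round-robin routing of records to reader tasks are all assumptions made for illustration; the actual partitioning between operator A and operator B may differ.

```java
import java.util.Arrays;

public class SplitDistributionSketch {

    // Hypothetical model: every record emitted by the split-assigning operator (A)
    // is routed to exactly one reader task (B), round-robin.
    // Returns the number of records received by each reader task.
    static int[] distribute(int recordsFromAssigner, int readerParallelism) {
        int[] perTask = new int[readerParallelism];
        for (int i = 0; i < recordsFromAssigner; i++) {
            perTask[i % readerParallelism]++;
        }
        return perTask;
    }

    // Counts the reader tasks that received no record at all.
    static long idleTasks(int[] perTask) {
        return Arrays.stream(perTask).filter(count -> count == 0).count();
    }

    public static void main(String[] args) {
        // Observed in this issue: operator A sends 1 record, operator B runs with parallelism 64.
        int[] observed = distribute(1, 64);
        // Expected: both values follow default-source-parallelism, set to 8 here.
        int[] expected = distribute(8, 8);

        System.out.println("observed idle reader tasks: " + idleTasks(observed)); // 63
        System.out.println("expected idle reader tasks: " + idleTasks(expected)); // 0
    }
}
```

Under this simplified model, a single record can reach only one of the 64 reader tasks, whatever the routing strategy is, which matches the 63 idle tasks reported above.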
Attachments
Issue Links
- is duplicated by: FLINK-26579 The default source parallelism configuration in adaptive batch scheduling does not take effect (Closed)
- relates to: FLINK-24892 FLIP-187: Adaptive Batch Scheduler (Closed)
- links to