Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 1.17.0
Description
If the TM performs a materialization before it receives confirm() for a checkpoint, the previously uploaded queue in `FsStateChangelogWriter` is cleared, so the local files of the completed checkpoint are never registered again. The JM-owned files, by contrast, are registered before confirm() and do not depend on the uploaded queue. As a result, the local files are deleted while the DFS files are still there.
We have encountered the following situation: the job cannot find the local recovery files, but it can still restore from the DFS files:
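The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Flink's actual code: `ToyWriter`, its fields, and its method names are hypothetical stand-ins for the behavior attributed to `FsStateChangelogWriter`, and the real class is considerably more involved.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ConfirmRaceSketch {

    /** Hypothetical stand-in for the writer; NOT the real FsStateChangelogWriter. */
    static class ToyWriter {
        // Local files that were uploaded but not yet confirmed.
        final List<String> uploaded = new ArrayList<>();
        // Local files registered (and therefore kept) for local recovery.
        final Set<String> registeredLocal = new HashSet<>();

        void upload(String localFile) {
            uploaded.add(localFile);
        }

        // Assumption: materialization clears the uploaded queue.
        void materialize() {
            uploaded.clear();
        }

        // Assumption: confirm() registers only what is still in the queue.
        void confirm() {
            registeredLocal.addAll(uploaded);
        }

        boolean canRecoverLocally(String localFile) {
            return registeredLocal.contains(localFile);
        }
    }

    /** Returns whether the local file is still available after confirm(). */
    static boolean run(boolean materializeBeforeConfirm) {
        ToyWriter writer = new ToyWriter();
        writer.upload("changelog-1");
        if (materializeBeforeConfirm) {
            writer.materialize();
        }
        writer.confirm();
        return writer.canRecoverLocally("changelog-1");
    }

    public static void main(String[] args) {
        System.out.println("confirm first: " + run(false));
        System.out.println("materialize first: " + run(true));
    }
}
```

In this model, `run(false)` returns `true` and `run(true)` returns `false`: once the queue has been cleared, confirm() has nothing left to register, which matches the missing local file in the report.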
2023-01-18 17:21:13,412 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.runtime.taskmanager.Task [] - SlidingProcessingTimeWindows (37/48)#1 #1 (fa12cfa3b811a351e031b036b0e85d91) switched from DEPLOYING to INITIALIZING.
2023-01-18 17:21:13,440 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.runtime.state.TaskLocalStateStoreImpl [] - Found registered local state for checkpoint 11599 in subtask (2daf1d9bc9ed40ecb191303db813b0de - 0a448493b4782967b150582570326227 - 36) : TaskOperatorSubtaskStates{subtaskStatesByOperatorID={0a448493b4782967b150582570326227=SubtaskState{operatorStateFromBackend=StateObjectCollection{[]}, operatorStateFromStream=StateObjectCollection{[]}, keyedStateFromBackend=StateObjectCollection{[org.apache.flink.runtime.state.changelog.ChangelogStateBackendLocalHandle@38aa46db]}, keyedStateFromStream=StateObjectCollection{[]}, inputChannelState=StateObjectCollection{[]}, resultSubpartitionState=StateObjectCollection{[]}, stateSize=1764644202, checkpointedSize=1997682}}, isTaskDeployedAsFinished=false, isTaskFinished=false}
2023-01-18 17:21:13,442 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Getting managed memory shared cache for RocksDB.
2023-01-18 17:21:13,446 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Obtained shared RocksDB cache of size 1438814063 bytes
2023-01-18 17:21:13,447 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Starting to restore from state handle: IncrementalLocalKeyedStateHandle{metaDataState=File State: file:/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/0d082666-bd31-4ebe-9977-545c0d9b18a5 [1187 bytes]} DirectoryKeyedStateHandle{directoryStateHandle=DirectoryStateHandle{directory=/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/b3e1d20f164d4c5baed291f5d1224183}, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}} without rescaling.
2023-01-18 17:21:13,495 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Finished restoring from state handle: IncrementalLocalKeyedStateHandle{metaDataState=File State: file:/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/0d082666-bd31-4ebe-9977-545c0d9b18a5 [1187 bytes]} DirectoryKeyedStateHandle{directoryStateHandle=DirectoryStateHandle{directory=/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/b3e1d20f164d4c5baed291f5d1224183}, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}} without rescaling.
2023-01-18 17:21:13,495 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - restore rocksdb cost 48 ms.
2023-01-18 17:21:13,495 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder [] - Finished building RocksDB keyed state-backend at /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/tmp/job_2daf1d9bc9ed40ecb191303db813b0de_op_WindowOperator_0a448493b4782967b150582570326227__37_48__uuid_2cbcf5ff-4451-4788-8762-158077c8368e.
2023-01-18 17:21:13,501 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.changelog.fs.FsStateChangelogStorage [] - createWriter for operator WindowOperator_0a448493b4782967b150582570326227_(37/48)/KeyGroupRange{startKeyGroup=96, endKeyGroup=98}: 00000000-0000-0000-0000-000000000001
2023-01-18 17:21:13,502 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation [] - read changelog handle start, total state size=190851072 .
2023-01-18 17:21:13,502 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.runtime.state.changelog.StateChangelogStorageLoader [] - Creating a changelog storage with name 'filesystem' to restore from 'ChangelogStateHandleStreamImpl'.
2023-01-18 17:21:13,529 [Source Data Fetcher for Source: KafkaWindowSource (37/48)#1] INFO org.apache.kafka.clients.Metadata [] - [Consumer clientId=xr_cl_1-36, groupId=xr_cl_1] Cluster ID: 56sVc6RESJ63Jh6BnsMjkA
2023-01-18 17:21:13,515 [SlidingProcessingTimeWindows (37/48)#1] WARN org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] - Exception while restoring keyed state backend for WindowOperator_0a448493b4782967b150582570326227_(37/48) from alternative (1/2), will retry while more alternatives are available.
java.lang.RuntimeException: java.io.FileNotFoundException: /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/taskowned/cc3bac5d-020c-4ee0-8999-d661f4b9beac (No such file or directory) at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:121) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:89) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) 
~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:267) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:701) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:677) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:644) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:954) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:923) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at 
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:568) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at java.lang.Thread.run(Thread.java:834) [?:1.8.0_102] Caused by: java.io.FileNotFoundException: /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/taskowned/cc3bac5d-020c-4ee0-8999-d661f4b9beac (No such file or directory) at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_102] at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_102] at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_102] at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:141) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:89) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:72) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:89) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85) 
~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT] ... 21 more 2023-01-18 17:21:13,545 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Getting managed memory shared cache for RocksDB. 2023-01-18 17:21:13,545 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Obtained shared RocksDB cache of size 1438814063 bytes 2023-01-18 17:21:13,546 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Starting to restore from state handle: IncrementalRemoteKeyedStateHandle{backendIdentifier=b3e1d20f-164d-4c5b-aed2-91f5d1224183, stateHandleId=f404ffdb-715e-4f95-a850-f459639a30e6, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}, checkpointId=125, sharedState={001388.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/bd7103e0-fe66-4400-a2a2-e4f3dda01b71 [51353250 bytes], 001383.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e937d12-e1d5-4427-9401-80f5db6af2ee [67393725 bytes], 001314.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c2d374dc-88ba-4003-bad5-81590e56963d [67407704 bytes], 001403.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a7ec0f13-3ef1-4e69-a50e-071b9f6b092b [67411564 bytes], 001416.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9c82f81a-33d6-4b8f-a237-2150d0c311d8 [10391374 bytes], 001384.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b591513d-b850-4025-a126-452137b4a6fa [67397014 bytes], 001413.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e2a0408-f9b4-4875-8594-2dc59df3bc66 [5307477 bytes], 001400.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/16bf5cda-4f04-40c9-8b54-ca1c393028f9 [67953551 bytes], 001316.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9543bc5c-dd9d-47b6-9626-159559b9ee45 [67406146 bytes], 001315.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b6099b46-bbef-486a-8c2f-ef232eefdeff [67409984 bytes], 
001408.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66354b67-97b2-41e8-ae48-678af553d2a7 [16861835 bytes], 001404.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a550c63b-003c-48d5-9bc2-45341ce4e641 [67413763 bytes], 001406.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/55822a0c-8448-48d2-a0c1-0bbccfaf31ca [67414722 bytes], 001317.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2b1f4dcc-76e2-4492-8470-bd512031a0ab [67407862 bytes], 001401.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/42a7b773-95d6-4a8b-bb18-64f083e9c627 [20184247 bytes], 001414.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/ec3a3d64-c0c1-4a5e-b541-dfc01a0f028d [19636706 bytes], 001385.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2ef14224-f229-4852-b97e-50268979f2a3 
[67395048 bytes], 001399.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a4550160-1c0c-4bb7-9eb9-26e2a6d9ed43 [67931628 bytes], 001381.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c6278757-9048-4e9d-a67d-5b9fd46c2c4f [67404982 bytes], 001368.sst=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/4497322c-c46b-4d42-9313-1f36d929c577', dataBytes=1354}, 001407.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3308882e-bf0f-4f26-9f3a-3d3dfb05fc85 [67409775 bytes], 001405.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5d5e1529-b109-483d-afa6-5d44297bec6d [67415504 bytes], 001318.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/7f9a7aa5-db80-4448-b987-1f9fdb88f5d1 [67406909 bytes], 001386.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2cf3a080-37d5-4133-b9d2-83e2ad49fc24 [67395104 bytes], 001411.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66d3176b-2257-403a-be82-03c49162c05f [19605638 bytes], 001410.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/108bb1c1-99da-4920-acab-15f254326ed0 [11733963 bytes], 001387.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9027f480-b895-4336-80b9-dea244ce1572 [67397047 bytes], 001382.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/398f2188-6336-4efe-a014-6bd3bf2ce8c6 [67396271 bytes], 001415.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c73af6f6-18e2-424e-90f9-586809dfbba5 [1931553 bytes], 001402.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/f80462bb-eef1-4e3a-845c-97dea160e306 [67410281 bytes], 
001313.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/88251386-e092-4a5e-b8bf-9c2e65148f5a [67408879 bytes]}, privateState={OPTIONS-000013=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b32ad6b5-4540-4b3e-9a82-14cebe231898', dataBytes=17286}, MANIFEST-000004=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c94b6317-b50b-4bf1-938d-83d7aad6ed6a [179781 bytes], CURRENT=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3665ebc2-52ee-4a2e-87b6-6e0cc8458c37', dataBytes=16}}, metaStateHandle=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9af5edc4-c589-4f0b-91d7-515e068a3454', dataBytes=1187}, registered=false} without rescaling. 
2023-01-18 17:22:08,867 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.RocksDBStateDownloader [] - download IncrementalRemoteKeyedStateHandle{backendIdentifier=b3e1d20f-164d-4c5b-aed2-91f5d1224183, stateHandleId=f404ffdb-715e-4f95-a850-f459639a30e6, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}, checkpointId=125, sharedState={001388.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/bd7103e0-fe66-4400-a2a2-e4f3dda01b71 [51353250 bytes], 001383.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e937d12-e1d5-4427-9401-80f5db6af2ee [67393725 bytes], 001314.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c2d374dc-88ba-4003-bad5-81590e56963d [67407704 bytes], 001403.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a7ec0f13-3ef1-4e69-a50e-071b9f6b092b [67411564 bytes], 001416.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9c82f81a-33d6-4b8f-a237-2150d0c311d8 [10391374 bytes], 001384.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b591513d-b850-4025-a126-452137b4a6fa [67397014 bytes], 001413.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e2a0408-f9b4-4875-8594-2dc59df3bc66 [5307477 bytes], 001400.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/16bf5cda-4f04-40c9-8b54-ca1c393028f9 [67953551 bytes], 001316.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9543bc5c-dd9d-47b6-9626-159559b9ee45 [67406146 bytes], 001315.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b6099b46-bbef-486a-8c2f-ef232eefdeff [67409984 bytes], 001408.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66354b67-97b2-41e8-ae48-678af553d2a7 [16861835 bytes], 001404.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a550c63b-003c-48d5-9bc2-45341ce4e641 [67413763 bytes], 
001406.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/55822a0c-8448-48d2-a0c1-0bbccfaf31ca [67414722 bytes], 001317.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2b1f4dcc-76e2-4492-8470-bd512031a0ab [67407862 bytes], 001401.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/42a7b773-95d6-4a8b-bb18-64f083e9c627 [20184247 bytes], 001414.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/ec3a3d64-c0c1-4a5e-b541-dfc01a0f028d [19636706 bytes], 001385.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2ef14224-f229-4852-b97e-50268979f2a3 [67395048 bytes], 001399.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a4550160-1c0c-4bb7-9eb9-26e2a6d9ed43 [67931628 bytes], 001381.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c6278757-9048-4e9d-a67d-5b9fd46c2c4f 
[67404982 bytes], 001368.sst=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/4497322c-c46b-4d42-9313-1f36d929c577', dataBytes=1354}, 001407.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3308882e-bf0f-4f26-9f3a-3d3dfb05fc85 [67409775 bytes], 001405.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5d5e1529-b109-483d-afa6-5d44297bec6d [67415504 bytes], 001318.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/7f9a7aa5-db80-4448-b987-1f9fdb88f5d1 [67406909 bytes], 001386.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2cf3a080-37d5-4133-b9d2-83e2ad49fc24 [67395104 bytes], 001411.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66d3176b-2257-403a-be82-03c49162c05f [19605638 bytes], 001410.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/108bb1c1-99da-4920-acab-15f254326ed0 [11733963 bytes], 001387.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9027f480-b895-4336-80b9-dea244ce1572 [67397047 bytes], 001382.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/398f2188-6336-4efe-a014-6bd3bf2ce8c6 [67396271 bytes], 001415.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c73af6f6-18e2-424e-90f9-586809dfbba5 [1931553 bytes], 001402.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/f80462bb-eef1-4e3a-845c-97dea160e306 [67410281 bytes], 001313.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/88251386-e092-4a5e-b8bf-9c2e65148f5a [67408879 bytes]}, 
privateState={OPTIONS-000013=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b32ad6b5-4540-4b3e-9a82-14cebe231898', dataBytes=17286}, MANIFEST-000004=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c94b6317-b50b-4bf1-938d-83d7aad6ed6a [179781 bytes], CURRENT=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3665ebc2-52ee-4a2e-87b6-6e0cc8458c37', dataBytes=16}}, metaStateHandle=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9af5edc4-c589-4f0b-91d7-515e068a3454', dataBytes=1187}, registered=false}, state size = 1573793130, cost 55319 ms. 
2023-01-18 17:22:08,909 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Finished restoring from state handle: IncrementalRemoteKeyedStateHandle{backendIdentifier=b3e1d20f-164d-4c5b-aed2-91f5d1224183, stateHandleId=f404ffdb-715e-4f95-a850-f459639a30e6, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}, checkpointId=125, sharedState={001388.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/bd7103e0-fe66-4400-a2a2-e4f3dda01b71 [51353250 bytes], 001383.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e937d12-e1d5-4427-9401-80f5db6af2ee [67393725 bytes], 001314.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c2d374dc-88ba-4003-bad5-81590e56963d [67407704 bytes], 001403.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a7ec0f13-3ef1-4e69-a50e-071b9f6b092b [67411564 bytes], 001416.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9c82f81a-33d6-4b8f-a237-2150d0c311d8 [10391374 bytes], 001384.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b591513d-b850-4025-a126-452137b4a6fa [67397014 bytes], 001413.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e2a0408-f9b4-4875-8594-2dc59df3bc66 [5307477 bytes], 001400.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/16bf5cda-4f04-40c9-8b54-ca1c393028f9 [67953551 bytes], 001316.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9543bc5c-dd9d-47b6-9626-159559b9ee45 [67406146 bytes], 001315.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b6099b46-bbef-486a-8c2f-ef232eefdeff [67409984 bytes], 001408.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66354b67-97b2-41e8-ae48-678af553d2a7 [16861835 bytes], 001404.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a550c63b-003c-48d5-9bc2-45341ce4e641 [67413763 bytes], 
001406.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/55822a0c-8448-48d2-a0c1-0bbccfaf31ca [67414722 bytes], 001317.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2b1f4dcc-76e2-4492-8470-bd512031a0ab [67407862 bytes], 001401.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/42a7b773-95d6-4a8b-bb18-64f083e9c627 [20184247 bytes], 001414.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/ec3a3d64-c0c1-4a5e-b541-dfc01a0f028d [19636706 bytes], 001385.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2ef14224-f229-4852-b97e-50268979f2a3 [67395048 bytes], 001399.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a4550160-1c0c-4bb7-9eb9-26e2a6d9ed43 [67931628 bytes], 001381.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c6278757-9048-4e9d-a67d-5b9fd46c2c4f 
[67404982 bytes], 001368.sst=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/4497322c-c46b-4d42-9313-1f36d929c577', dataBytes=1354}, 001407.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3308882e-bf0f-4f26-9f3a-3d3dfb05fc85 [67409775 bytes], 001405.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5d5e1529-b109-483d-afa6-5d44297bec6d [67415504 bytes], 001318.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/7f9a7aa5-db80-4448-b987-1f9fdb88f5d1 [67406909 bytes], 001386.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2cf3a080-37d5-4133-b9d2-83e2ad49fc24 [67395104 bytes], 001411.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66d3176b-2257-403a-be82-03c49162c05f [19605638 bytes], 001410.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/108bb1c1-99da-4920-acab-15f254326ed0 [11733963 bytes], 001387.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9027f480-b895-4336-80b9-dea244ce1572 [67397047 bytes], 001382.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/398f2188-6336-4efe-a014-6bd3bf2ce8c6 [67396271 bytes], 001415.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c73af6f6-18e2-424e-90f9-586809dfbba5 [1931553 bytes], 001402.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/f80462bb-eef1-4e3a-845c-97dea160e306 [67410281 bytes], 001313.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/88251386-e092-4a5e-b8bf-9c2e65148f5a [67408879 bytes]}, 
privateState={OPTIONS-000013=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b32ad6b5-4540-4b3e-9a82-14cebe231898', dataBytes=17286}, MANIFEST-000004=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c94b6317-b50b-4bf1-938d-83d7aad6ed6a [179781 bytes], CURRENT=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3665ebc2-52ee-4a2e-87b6-6e0cc8458c37', dataBytes=16}}, metaStateHandle=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9af5edc4-c589-4f0b-91d7-515e068a3454', dataBytes=1187}, registered=false} without rescaling. 2023-01-18 17:22:08,911 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - restore rocksdb cost 55365 ms. 2023-01-18 17:22:08,912 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder [] - Finished building RocksDB keyed state-backend at /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/tmp/job_2daf1d9bc9ed40ecb191303db813b0de_op_WindowOperator_0a448493b4782967b150582570326227__37_48__uuid_1404e597-c96c-4d7c-99b7-303fd98f80bd. 
2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastFullSizeOfMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastIncSizeOfMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastFullSizeOfNonMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN org.apache.flink.metrics.MetricGroup [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastIncSizeOfNonMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.changelog.fs.FsStateChangelogStorage [] - createWriter for operator WindowOperator_0a448493b4782967b150582570326227_(37/48)/KeyGroupRange{startKeyGroup=96, endKeyGroup=98}: 00000000-0000-0000-0000-000000000002
2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation [] - read changelog handle start, total state size=190851072 .
2023-01-18 17:22:08,919 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.fs.osshadoop.StsFetcherCredentialsProvider [] - Old credential is going to expire. Fetch a new one.
2023-01-18 17:22:38,158 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation [] - read read changelog handle end, cost 29243 ms.
2023-01-18 17:22:38,158 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.common.PeriodicMaterializationManager [] - Task SlidingProcessingTimeWindows (37/48)#1 starts periodic materialization
2023-01-18 17:22:38,158 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.common.PeriodicMaterializationManager [] - Task SlidingProcessingTimeWindows (37/48)#1 schedules the next materialization in 82 seconds
2023-01-18 17:22:38,176 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.runtime.taskmanager.Task [] - SlidingProcessingTimeWindows (37/48)#1 #1 (fa12cfa3b811a351e031b036b0e85d91) switched from INITIALIZING to RUNNING.
2023-01-18 17:22:39,057 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.ChangelogKeyedStateBackend [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11601, change range: 0..2, materialization ID 125
2023-01-18 17:22:43,779 [Source Data Fetcher for Source: KafkaWindowSource (37/48)#1] INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator [] - [Consumer clientId=xr_cl_1-36, groupId=xr_cl_1] Discovered group coordinator 192.168.47.158:9092 (id: 2147483546 rack: null)
2023-01-18 17:22:44,100 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.ChangelogKeyedStateBackend [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11602, change range: 0..11, materialization ID 125
2023-01-18 17:22:47,531 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.ChangelogKeyedStateBackend [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11603, change range: 0..17, materialization ID 125
2023-01-18 17:22:50,837 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.ChangelogKeyedStateBackend [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11604, change range: 0..21, materialization ID 125
2023-01-18 17:22:53,580 [SlidingProcessingTimeWindows (37/48)#1] INFO org.apache.flink.state.changelog.ChangelogKeyedStateBackend [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11605, change range: 0..23, materialization ID 125
The above log can be simplified to the following scenario:
- cp1 trigger: file1, file1' (local)
- JM: register [file1] to sharedRegistry
- cp1 complete: stopTracking [file1], register [file1'] to localRegistry
- cp2 trigger: file1, file1', file2, file2'
- JM: register [file1, file2] to sharedRegistry
- cp2 complete: stopTracking [file1, file2], register [file1', file2'] to localRegistry
- cp1 subsume
- cp3 trigger: file1, file1', file2, file2', file3, file3'
- materialization: uploaded.clear()
- JM: register [file1, file2, file3] to sharedRegistry
- cp3 complete: stopTracking [file3], register [file3'] to localRegistry
- cp2 subsume: [file1', file2'] are discarded
- if restore from cp3: local file1', file2' are not found
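The scenario can be replayed with a small toy model. This is only a sketch and not Flink code: the class `LocalChangelogRaceSketch`, its helper methods, and the reference-counting style of the local registry are hypothetical simplifications. It assumes that confirm() registers whatever is currently in the uploaded queue, that subsuming a checkpoint deletes local files no retained checkpoint still references, and that file3' is still in the queue when cp3 is confirmed (as the "cp3 complete" step suggests).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of the scenario above. NOT Flink code; all names are hypothetical. */
class LocalChangelogRaceSketch {

    /** confirm(): the checkpoint registers every local file currently in the uploaded queue. */
    static void confirm(long cp, List<String> uploaded, Map<Long, Set<String>> localRegistry) {
        localRegistry.put(cp, new HashSet<>(uploaded));
    }

    /** subsume(): drop the checkpoint and delete files no retained checkpoint references. */
    static void subsume(long cp, Map<Long, Set<String>> localRegistry, Set<String> localDisk) {
        Set<String> released = localRegistry.remove(cp);
        for (String file : released) {
            boolean stillReferenced =
                    localRegistry.values().stream().anyMatch(files -> files.contains(file));
            if (!stillReferenced) {
                localDisk.remove(file);
            }
        }
    }

    /** Returns the local files cp3 needs for restore that are no longer on disk. */
    static Set<String> missingLocalFiles(boolean materializeBeforeConfirm) {
        List<String> uploaded = new ArrayList<>();              // stand-in for the writer's uploaded queue
        Map<Long, Set<String>> localRegistry = new TreeMap<>(); // checkpoint id -> registered local files
        Set<String> localDisk = new HashSet<>();                // local files that physically exist

        // cp1: file1' is written locally and confirmed.
        uploaded.add("file1'");
        localDisk.add("file1'");
        confirm(1, uploaded, localRegistry);

        // cp2: file2' is added; cp1 is subsumed, but cp2 still references file1'.
        uploaded.add("file2'");
        localDisk.add("file2'");
        confirm(2, uploaded, localRegistry);
        subsume(1, localRegistry, localDisk);

        // cp3 trigger: the snapshot references file1', file2' and file3'.
        Set<String> neededByCp3 = new TreeSet<>(uploaded);
        neededByCp3.add("file3'");
        localDisk.add("file3'");

        // Materialization before confirm() clears the previously uploaded entries.
        if (materializeBeforeConfirm) {
            uploaded.clear();
        }
        uploaded.add("file3'");

        // cp3 complete, then cp2 subsumed.
        confirm(3, uploaded, localRegistry);
        subsume(2, localRegistry, localDisk);

        neededByCp3.removeAll(localDisk);
        return neededByCp3;
    }

    public static void main(String[] args) {
        System.out.println("materialized before confirm: " + missingLocalFiles(true));
        System.out.println("no materialization:          " + missingLocalFiles(false));
    }
}
```

In this model, `missingLocalFiles(true)` returns file1' and file2', matching the last step of the scenario, while `missingLocalFiles(false)` returns an empty set because cp3 re-registers all three local files before cp2 is subsumed.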
Attachments
Issue Links
- links to