FLINK-30863

Register local recovery files of changelog before notifyCheckpointComplete()

    Description

      If materialization happens on the TM before it receives confirm(), the uploaded queue in `FsStateChangelogWriter` is cleared, so the local files of the completed checkpoint are not registered when confirm() arrives. The JM-owned files, by contrast, are registered before confirm() and do not depend on the uploaded queue. As a result, the local files are deleted while the DFS files are still there.
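      The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyWriter`, `persist`, `materialize`, `confirm` and the `registerOnPersist` flag are hypothetical names invented for illustration and do not mirror the real `FsStateChangelogWriter` API. It shows that when local files are registered only on confirm(), clearing the uploaded queue first leaves them unregistered, while registering them at upload time (the idea in the title of this issue) does not depend on the queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the race; all names are hypothetical, not Flink's real API. */
public class LocalRegistrationRace {

    /** One uploaded changelog segment together with its local recovery copy. */
    static final class Upload {
        final long checkpointId;
        final String localFile;

        Upload(long checkpointId, String localFile) {
            this.checkpointId = checkpointId;
            this.localFile = localFile;
        }
    }

    static final class ToyWriter {
        private final boolean registerOnPersist;
        private final Deque<Upload> uploaded = new ArrayDeque<>();
        private final Set<String> registeredLocalFiles = new HashSet<>();

        ToyWriter(boolean registerOnPersist) {
            this.registerOnPersist = registerOnPersist;
        }

        /** Called when a checkpoint's changes have been uploaded. */
        void persist(long checkpointId, String localFile) {
            uploaded.add(new Upload(checkpointId, localFile));
            if (registerOnPersist) {
                // Assumed fix: register before confirm(), independent of the queue.
                registeredLocalFiles.add(localFile);
            }
        }

        /** Materialization truncates the changelog and clears the uploaded queue. */
        void materialize() {
            uploaded.clear();
        }

        /** Checkpoint confirmation: registers whatever is still in the queue. */
        void confirm(long checkpointId) {
            for (Upload u : uploaded) {
                if (u.checkpointId <= checkpointId) {
                    registeredLocalFiles.add(u.localFile);
                }
            }
        }

        boolean isRegistered(String localFile) {
            return registeredLocalFiles.contains(localFile);
        }
    }

    /** Runs persist -> materialize -> confirm and reports whether the local file is registered. */
    static boolean survives(boolean registerOnPersist) {
        ToyWriter writer = new ToyWriter(registerOnPersist);
        writer.persist(125L, "taskowned/segment-a");
        writer.materialize(); // happens before confirm() arrives
        writer.confirm(125L);
        return writer.isRegistered("taskowned/segment-a");
    }

    public static void main(String[] args) {
        System.out.println("register on confirm only: " + survives(false)); // false
        System.out.println("register on persist:      " + survives(true));  // true
    }
}
```

      In the first case an unregistered local file is what a cleanup pass would be free to delete, which matches the FileNotFoundException on the `taskowned` path in the logs below.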

       

      We encountered the following situation, in which the job cannot find the local recovery files but can still restore from the DFS files:

      2023-01-18 17:21:13,412 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.runtime.taskmanager.Task                    [] - SlidingProcessingTimeWindows (37/48)#1 #1 (fa12cfa3b811a351e031b036b0e85d91) switched from DEPLOYING to INITIALIZING.
      2023-01-18 17:21:13,440 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.runtime.state.TaskLocalStateStoreImpl       [] - Found registered local state for checkpoint 11599 in subtask (2daf1d9bc9ed40ecb191303db813b0de - 0a448493b4782967b150582570326227 - 36) : TaskOperatorSubtaskStates{subtaskStatesByOperatorID={0a448493b4782967b150582570326227=SubtaskState{operatorStateFromBackend=StateObjectCollection{[]}, operatorStateFromStream=StateObjectCollection{[]}, keyedStateFromBackend=StateObjectCollection{[org.apache.flink.runtime.state.changelog.ChangelogStateBackendLocalHandle@38aa46db]}, keyedStateFromStream=StateObjectCollection{[]}, inputChannelState=StateObjectCollection{[]}, resultSubpartitionState=StateObjectCollection{[]}, stateSize=1764644202, checkpointedSize=1997682}}, isTaskDeployedAsFinished=false, isTaskFinished=false}
      2023-01-18 17:21:13,442 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Getting managed memory shared cache for RocksDB.
      2023-01-18 17:21:13,446 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Obtained shared RocksDB cache of size 1438814063 bytes
      2023-01-18 17:21:13,447 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Starting to restore from state handle: IncrementalLocalKeyedStateHandle{metaDataState=File State: file:/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/0d082666-bd31-4ebe-9977-545c0d9b18a5 [1187 bytes]} DirectoryKeyedStateHandle{directoryStateHandle=DirectoryStateHandle{directory=/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/b3e1d20f164d4c5baed291f5d1224183}, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}} without rescaling.
      2023-01-18 17:21:13,495 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Finished restoring from state handle: IncrementalLocalKeyedStateHandle{metaDataState=File State: file:/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/0d082666-bd31-4ebe-9977-545c0d9b18a5 [1187 bytes]} DirectoryKeyedStateHandle{directoryStateHandle=DirectoryStateHandle{directory=/opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/vtx_0a448493b4782967b150582570326227_sti_36/chk_125/b3e1d20f164d4c5baed291f5d1224183}, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}} without rescaling.
      2023-01-18 17:21:13,495 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - restore rocksdb cost 48 ms.
      2023-01-18 17:21:13,495 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder [] - Finished building RocksDB keyed state-backend at /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/tmp/job_2daf1d9bc9ed40ecb191303db813b0de_op_WindowOperator_0a448493b4782967b150582570326227__37_48__uuid_2cbcf5ff-4451-4788-8762-158077c8368e.
      2023-01-18 17:21:13,501 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.changelog.fs.FsStateChangelogStorage        [] - createWriter for operator WindowOperator_0a448493b4782967b150582570326227_(37/48)/KeyGroupRange{startKeyGroup=96, endKeyGroup=98}: 00000000-0000-0000-0000-000000000001
      2023-01-18 17:21:13,502 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation [] - read changelog handle start, total state size=190851072 .
      2023-01-18 17:21:13,502 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.runtime.state.changelog.StateChangelogStorageLoader [] - Creating a changelog storage with name 'filesystem' to restore from 'ChangelogStateHandleStreamImpl'.
      2023-01-18 17:21:13,529 [Source Data Fetcher for Source: KafkaWindowSource (37/48)#1] INFO  org.apache.kafka.clients.Metadata                            [] - [Consumer clientId=xr_cl_1-36, groupId=xr_cl_1] Cluster ID: 56sVc6RESJ63Jh6BnsMjkA
      2023-01-18 17:21:13,515 [SlidingProcessingTimeWindows (37/48)#1] WARN  org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] - Exception while restoring keyed state backend for WindowOperator_0a448493b4782967b150582570326227_(37/48) from alternative (1/2), will retry while more alternatives are available.
      java.lang.RuntimeException: java.io.FileNotFoundException: /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/taskowned/cc3bac5d-020c-4ee0-8999-d661f4b9beac (No such file or directory)
          at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:121) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:89) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:267) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:701) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:677) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:644) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:954) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:923) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.taskmanager.Task.run(Task.java:568) [flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at java.lang.Thread.run(Thread.java:834) [?:1.8.0_102]
      Caused by: java.io.FileNotFoundException: /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/localState/aid_45af7e6b612dad10b60554d81323d5f3/jid_2daf1d9bc9ed40ecb191303db813b0de/taskowned/cc3bac5d-020c-4ee0-8999-d661f4b9beac (No such file or directory)
          at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_102]
          at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_102]
          at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_102]
          at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:141) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:89) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:72) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:89) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85) ~[flink-dist_2.12-1.15-vvr-6.0-SNAPSHOT.jar:1.15-vvr-6.0-SNAPSHOT]
          ... 21 more
      2023-01-18 17:21:13,545 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Getting managed memory shared cache for RocksDB.
      2023-01-18 17:21:13,545 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend [] - Obtained shared RocksDB cache of size 1438814063 bytes
      2023-01-18 17:21:13,546 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Starting to restore from state handle: IncrementalRemoteKeyedStateHandle{backendIdentifier=b3e1d20f-164d-4c5b-aed2-91f5d1224183, stateHandleId=f404ffdb-715e-4f95-a850-f459639a30e6, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}, checkpointId=125, sharedState={001388.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/bd7103e0-fe66-4400-a2a2-e4f3dda01b71 [51353250 bytes], 001383.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e937d12-e1d5-4427-9401-80f5db6af2ee [67393725 bytes], 001314.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c2d374dc-88ba-4003-bad5-81590e56963d [67407704 bytes], 001403.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a7ec0f13-3ef1-4e69-a50e-071b9f6b092b [67411564 bytes], 001416.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9c82f81a-33d6-4b8f-a237-2150d0c311d8 [10391374 bytes], 001384.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b591513d-b850-4025-a126-452137b4a6fa [67397014 bytes], 001413.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e2a0408-f9b4-4875-8594-2dc59df3bc66 [5307477 bytes], 001400.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/16bf5cda-4f04-40c9-8b54-ca1c393028f9 [67953551 bytes], 001316.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9543bc5c-dd9d-47b6-9626-159559b9ee45 [67406146 bytes], 001315.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b6099b46-bbef-486a-8c2f-ef232eefdeff [67409984 bytes], 001408.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66354b67-97b2-41e8-ae48-678af553d2a7 [16861835 bytes], 001404.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a550c63b-003c-48d5-9bc2-45341ce4e641 [67413763 bytes], 
001406.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/55822a0c-8448-48d2-a0c1-0bbccfaf31ca [67414722 bytes], 001317.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2b1f4dcc-76e2-4492-8470-bd512031a0ab [67407862 bytes], 001401.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/42a7b773-95d6-4a8b-bb18-64f083e9c627 [20184247 bytes], 001414.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/ec3a3d64-c0c1-4a5e-b541-dfc01a0f028d [19636706 bytes], 001385.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2ef14224-f229-4852-b97e-50268979f2a3 [67395048 bytes], 001399.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a4550160-1c0c-4bb7-9eb9-26e2a6d9ed43 [67931628 bytes], 001381.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c6278757-9048-4e9d-a67d-5b9fd46c2c4f 
[67404982 bytes], 001368.sst=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/4497322c-c46b-4d42-9313-1f36d929c577', dataBytes=1354}, 001407.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3308882e-bf0f-4f26-9f3a-3d3dfb05fc85 [67409775 bytes], 001405.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5d5e1529-b109-483d-afa6-5d44297bec6d [67415504 bytes], 001318.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/7f9a7aa5-db80-4448-b987-1f9fdb88f5d1 [67406909 bytes], 001386.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2cf3a080-37d5-4133-b9d2-83e2ad49fc24 [67395104 bytes], 001411.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66d3176b-2257-403a-be82-03c49162c05f [19605638 bytes], 001410.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/108bb1c1-99da-4920-acab-15f254326ed0 [11733963 bytes], 001387.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9027f480-b895-4336-80b9-dea244ce1572 [67397047 bytes], 001382.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/398f2188-6336-4efe-a014-6bd3bf2ce8c6 [67396271 bytes], 001415.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c73af6f6-18e2-424e-90f9-586809dfbba5 [1931553 bytes], 001402.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/f80462bb-eef1-4e3a-845c-97dea160e306 [67410281 bytes], 001313.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/88251386-e092-4a5e-b8bf-9c2e65148f5a [67408879 bytes]}, 
privateState={OPTIONS-000013=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b32ad6b5-4540-4b3e-9a82-14cebe231898', dataBytes=17286}, MANIFEST-000004=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c94b6317-b50b-4bf1-938d-83d7aad6ed6a [179781 bytes], CURRENT=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3665ebc2-52ee-4a2e-87b6-6e0cc8458c37', dataBytes=16}}, metaStateHandle=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9af5edc4-c589-4f0b-91d7-515e068a3454', dataBytes=1187}, registered=false} without rescaling.
      2023-01-18 17:22:08,867 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.RocksDBStateDownloader [] - download IncrementalRemoteKeyedStateHandle{backendIdentifier=b3e1d20f-164d-4c5b-aed2-91f5d1224183, stateHandleId=f404ffdb-715e-4f95-a850-f459639a30e6, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}, checkpointId=125, sharedState={001388.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/bd7103e0-fe66-4400-a2a2-e4f3dda01b71 [51353250 bytes], 001383.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e937d12-e1d5-4427-9401-80f5db6af2ee [67393725 bytes], 001314.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c2d374dc-88ba-4003-bad5-81590e56963d [67407704 bytes], 001403.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a7ec0f13-3ef1-4e69-a50e-071b9f6b092b [67411564 bytes], 001416.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9c82f81a-33d6-4b8f-a237-2150d0c311d8 [10391374 bytes], 001384.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b591513d-b850-4025-a126-452137b4a6fa [67397014 bytes], 001413.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e2a0408-f9b4-4875-8594-2dc59df3bc66 [5307477 bytes], 001400.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/16bf5cda-4f04-40c9-8b54-ca1c393028f9 [67953551 bytes], 001316.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9543bc5c-dd9d-47b6-9626-159559b9ee45 [67406146 bytes], 001315.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b6099b46-bbef-486a-8c2f-ef232eefdeff [67409984 bytes], 001408.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66354b67-97b2-41e8-ae48-678af553d2a7 [16861835 bytes], 001404.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a550c63b-003c-48d5-9bc2-45341ce4e641 [67413763 bytes], 
001406.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/55822a0c-8448-48d2-a0c1-0bbccfaf31ca [67414722 bytes], 001317.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2b1f4dcc-76e2-4492-8470-bd512031a0ab [67407862 bytes], 001401.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/42a7b773-95d6-4a8b-bb18-64f083e9c627 [20184247 bytes], 001414.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/ec3a3d64-c0c1-4a5e-b541-dfc01a0f028d [19636706 bytes], 001385.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2ef14224-f229-4852-b97e-50268979f2a3 [67395048 bytes], 001399.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a4550160-1c0c-4bb7-9eb9-26e2a6d9ed43 [67931628 bytes], 001381.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c6278757-9048-4e9d-a67d-5b9fd46c2c4f 
[67404982 bytes], 001368.sst=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/4497322c-c46b-4d42-9313-1f36d929c577', dataBytes=1354}, 001407.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3308882e-bf0f-4f26-9f3a-3d3dfb05fc85 [67409775 bytes], 001405.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5d5e1529-b109-483d-afa6-5d44297bec6d [67415504 bytes], 001318.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/7f9a7aa5-db80-4448-b987-1f9fdb88f5d1 [67406909 bytes], 001386.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2cf3a080-37d5-4133-b9d2-83e2ad49fc24 [67395104 bytes], 001411.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66d3176b-2257-403a-be82-03c49162c05f [19605638 bytes], 001410.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/108bb1c1-99da-4920-acab-15f254326ed0 [11733963 bytes], 001387.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9027f480-b895-4336-80b9-dea244ce1572 [67397047 bytes], 001382.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/398f2188-6336-4efe-a014-6bd3bf2ce8c6 [67396271 bytes], 001415.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c73af6f6-18e2-424e-90f9-586809dfbba5 [1931553 bytes], 001402.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/f80462bb-eef1-4e3a-845c-97dea160e306 [67410281 bytes], 001313.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/88251386-e092-4a5e-b8bf-9c2e65148f5a [67408879 bytes]}, 
privateState={OPTIONS-000013=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b32ad6b5-4540-4b3e-9a82-14cebe231898', dataBytes=17286}, MANIFEST-000004=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c94b6317-b50b-4bf1-938d-83d7aad6ed6a [179781 bytes], CURRENT=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3665ebc2-52ee-4a2e-87b6-6e0cc8458c37', dataBytes=16}}, metaStateHandle=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9af5edc4-c589-4f0b-91d7-515e068a3454', dataBytes=1187}, registered=false}, state size = 1573793130, cost 55319 ms.
      2023-01-18 17:22:08,909 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - Finished restoring from state handle: IncrementalRemoteKeyedStateHandle{backendIdentifier=b3e1d20f-164d-4c5b-aed2-91f5d1224183, stateHandleId=f404ffdb-715e-4f95-a850-f459639a30e6, keyGroupRange=KeyGroupRange{startKeyGroup=96, endKeyGroup=98}, checkpointId=125, sharedState={001388.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/bd7103e0-fe66-4400-a2a2-e4f3dda01b71 [51353250 bytes], 001383.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e937d12-e1d5-4427-9401-80f5db6af2ee [67393725 bytes], 001314.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c2d374dc-88ba-4003-bad5-81590e56963d [67407704 bytes], 001403.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a7ec0f13-3ef1-4e69-a50e-071b9f6b092b [67411564 bytes], 001416.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9c82f81a-33d6-4b8f-a237-2150d0c311d8 [10391374 bytes], 001384.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b591513d-b850-4025-a126-452137b4a6fa [67397014 bytes], 001413.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5e2a0408-f9b4-4875-8594-2dc59df3bc66 [5307477 bytes], 001400.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/16bf5cda-4f04-40c9-8b54-ca1c393028f9 [67953551 bytes], 001316.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9543bc5c-dd9d-47b6-9626-159559b9ee45 [67406146 bytes], 001315.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b6099b46-bbef-486a-8c2f-ef232eefdeff [67409984 bytes], 001408.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66354b67-97b2-41e8-ae48-678af553d2a7 [16861835 bytes], 001404.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a550c63b-003c-48d5-9bc2-45341ce4e641 [67413763 bytes], 
001406.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/55822a0c-8448-48d2-a0c1-0bbccfaf31ca [67414722 bytes], 001317.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2b1f4dcc-76e2-4492-8470-bd512031a0ab [67407862 bytes], 001401.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/42a7b773-95d6-4a8b-bb18-64f083e9c627 [20184247 bytes], 001414.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/ec3a3d64-c0c1-4a5e-b541-dfc01a0f028d [19636706 bytes], 001385.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2ef14224-f229-4852-b97e-50268979f2a3 [67395048 bytes], 001399.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/a4550160-1c0c-4bb7-9eb9-26e2a6d9ed43 [67931628 bytes], 001381.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c6278757-9048-4e9d-a67d-5b9fd46c2c4f 
[67404982 bytes], 001368.sst=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/4497322c-c46b-4d42-9313-1f36d929c577', dataBytes=1354}, 001407.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3308882e-bf0f-4f26-9f3a-3d3dfb05fc85 [67409775 bytes], 001405.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/5d5e1529-b109-483d-afa6-5d44297bec6d [67415504 bytes], 001318.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/7f9a7aa5-db80-4448-b987-1f9fdb88f5d1 [67406909 bytes], 001386.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/2cf3a080-37d5-4133-b9d2-83e2ad49fc24 [67395104 bytes], 001411.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/66d3176b-2257-403a-be82-03c49162c05f [19605638 bytes], 001410.sst=File State: 
oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/108bb1c1-99da-4920-acab-15f254326ed0 [11733963 bytes], 001387.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9027f480-b895-4336-80b9-dea244ce1572 [67397047 bytes], 001382.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/398f2188-6336-4efe-a014-6bd3bf2ce8c6 [67396271 bytes], 001415.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c73af6f6-18e2-424e-90f9-586809dfbba5 [1931553 bytes], 001402.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/f80462bb-eef1-4e3a-845c-97dea160e306 [67410281 bytes], 001313.sst=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/88251386-e092-4a5e-b8bf-9c2e65148f5a [67408879 bytes]}, 
privateState={OPTIONS-000013=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/b32ad6b5-4540-4b3e-9a82-14cebe231898', dataBytes=17286}, MANIFEST-000004=File State: oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/c94b6317-b50b-4bf1-938d-83d7aad6ed6a [179781 bytes], CURRENT=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/3665ebc2-52ee-4a2e-87b6-6e0cc8458c37', dataBytes=16}}, metaStateHandle=ByteStreamStateHandle{handleName='oss://cluster-serving/flink-jobs/namespaces/state-test-default/deployments/1b1f8910-047f-4e51-a1bc-eea91e57600d/checkpoints/jobs/2daf1d9b-c9ed-40ec-b191-303db813b0de/2daf1d9bc9ed40ecb191303db813b0de/taskowned/9af5edc4-c589-4f0b-91d7-515e068a3454', dataBytes=1187}, registered=false} without rescaling.
      2023-01-18 17:22:08,911 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation [] - restore rocksdb cost 55365 ms.
      2023-01-18 17:22:08,912 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder [] - Finished building RocksDB keyed state-backend at /opt/flink/flink-tmp-dir/tm_job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31/tmp/job_2daf1d9bc9ed40ecb191303db813b0de_op_WindowOperator_0a448493b4782967b150582570326227__37_48__uuid_1404e597-c96c-4d7c-99b7-303fd98f80bd.
      2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN  org.apache.flink.metrics.MetricGroup                         [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastFullSizeOfMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
      2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN  org.apache.flink.metrics.MetricGroup                         [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastIncSizeOfMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
      2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN  org.apache.flink.metrics.MetricGroup                         [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastFullSizeOfNonMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
      2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] WARN  org.apache.flink.metrics.MetricGroup                         [] - Name collision: Group already contains a Metric with the name 'ChangelogStateBackend.lastIncSizeOfNonMaterialization'. Metric will not be reported.[192.168.32.162, taskmanager, job-2daf1d9b-c9ed-40ec-b191-303db813b0de-taskmanager-1-31, Flink Streaming Job, SlidingProcessingTimeWindows, 36]
      2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.changelog.fs.FsStateChangelogStorage        [] - createWriter for operator WindowOperator_0a448493b4782967b150582570326227_(37/48)/KeyGroupRange{startKeyGroup=96, endKeyGroup=98}: 00000000-0000-0000-0000-000000000002
      2023-01-18 17:22:08,915 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation [] - read changelog handle start, total state size=190851072 .
      2023-01-18 17:22:08,919 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.fs.osshadoop.StsFetcherCredentialsProvider  [] - Old credential is going to expire. Fetch a new one.
      2023-01-18 17:22:38,158 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation [] - read read changelog handle end, cost 29243 ms.
      2023-01-18 17:22:38,158 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.common.PeriodicMaterializationManager [] - Task SlidingProcessingTimeWindows (37/48)#1 starts periodic materialization
      2023-01-18 17:22:38,158 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.common.PeriodicMaterializationManager [] - Task SlidingProcessingTimeWindows (37/48)#1 schedules the next materialization in 82 seconds
      2023-01-18 17:22:38,176 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.runtime.taskmanager.Task                    [] - SlidingProcessingTimeWindows (37/48)#1 #1 (fa12cfa3b811a351e031b036b0e85d91) switched from INITIALIZING to RUNNING.
      2023-01-18 17:22:39,057 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.ChangelogKeyedStateBackend  [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11601, change range: 0..2, materialization ID 125
      2023-01-18 17:22:43,779 [Source Data Fetcher for Source: KafkaWindowSource (37/48)#1] INFO  org.apache.kafka.clients.consumer.internals.AbstractCoordinator [] - [Consumer clientId=xr_cl_1-36, groupId=xr_cl_1] Discovered group coordinator 192.168.47.158:9092 (id: 2147483546 rack: null)
      2023-01-18 17:22:44,100 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.ChangelogKeyedStateBackend  [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11602, change range: 0..11, materialization ID 125
      2023-01-18 17:22:47,531 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.ChangelogKeyedStateBackend  [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11603, change range: 0..17, materialization ID 125
      2023-01-18 17:22:50,837 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.ChangelogKeyedStateBackend  [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11604, change range: 0..21, materialization ID 125
      2023-01-18 17:22:53,580 [SlidingProcessingTimeWindows (37/48)#1] INFO  org.apache.flink.state.changelog.ChangelogKeyedStateBackend  [] - snapshot of SlidingProcessingTimeWindows (37/48)#1 for checkpoint 11605, change range: 0..23, materialization ID 125 

      The log above can be reduced to the following scenario, where fileN is the DFS copy and fileN' is the corresponding local copy:

      - cp1 triggered: file1, file1' (local)
      - JM: register [file1] in sharedRegistry
      - cp1 completed: stopTracking [file1], register [file1'] in localRegistry
      - cp2 triggered: file1, file1', file2, file2'
      - JM: register [file1, file2] in sharedRegistry
      - cp2 completed: stopTracking [file1, file2], register [file1', file2'] in localRegistry
      - cp1 subsumed
      - cp3 triggered: file1, file1', file2, file2', file3, file3'
      - materialization: uploaded.clear()
      - JM: register [file1, file2, file3] in sharedRegistry
      - cp3 completed: stopTracking [file3], register [file3'] in localRegistry
      - cp2 subsumed: [file1', file2'] are discarded
      - restore from cp3: local file1', file2' are not found
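
      The scenario above can be replayed with a small toy model. This is only a sketch under stated assumptions, not Flink code: the class `LocalRecoveryRaceSketch`, its methods, and the `registerAtTrigger` switch are all hypothetical names chosen for illustration, and the model assumes that a subsumed checkpoint discards every local file that no remaining registry entry references. With `registerAtTrigger = false`, local files are registered from the `uploaded` queue on confirm, as in the steps listed above. With `registerAtTrigger = true`, they are registered when the checkpoint is triggered, which is the direction the issue title proposes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of the scenario; none of these names are real Flink APIs. */
final class LocalRecoveryRaceSketch {
    private final List<String> uploaded = new ArrayList<>();   // hypothetical "uploaded" queue
    private final List<String> referenced = new ArrayList<>(); // local files each new checkpoint refers to
    private final Map<Long, Set<String>> localRegistry = new TreeMap<>();
    private final Set<String> localDisk = new HashSet<>();     // local files that still exist
    private final boolean registerAtTrigger;

    LocalRecoveryRaceSketch(boolean registerAtTrigger) {
        this.registerAtTrigger = registerAtTrigger;
    }

    /** A local changelog file is written; later checkpoints reference it. */
    void write(String file) {
        localDisk.add(file);
        referenced.add(file);
    }

    /** The upload of a file finishes and it enters the queue. */
    void uploadDone(String file) {
        uploaded.add(file);
    }

    /** Triggers a checkpoint and returns the local files it references. */
    Set<String> trigger(long cp) {
        Set<String> files = new HashSet<>(referenced);
        if (registerAtTrigger) {
            localRegistry.put(cp, files); // proposed: register before confirm
        }
        return files;
    }

    /** Materialization clears the queue, as in the scenario. */
    void materialize() {
        uploaded.clear();
    }

    /** Confirm: without the fix, only files still in the queue get registered. */
    void confirm(long cp) {
        if (!registerAtTrigger) {
            localRegistry.put(cp, new HashSet<>(uploaded));
        }
    }

    /** Subsume: discard local files that no remaining checkpoint has registered. */
    void subsume(long cp) {
        Set<String> candidates = localRegistry.remove(cp);
        if (candidates == null) {
            return;
        }
        for (String f : candidates) {
            boolean stillUsed = localRegistry.values().stream().anyMatch(s -> s.contains(f));
            if (!stillUsed) {
                localDisk.remove(f);
            }
        }
    }

    boolean canRestoreLocally(Set<String> files) {
        return localDisk.containsAll(files);
    }

    /** Replays cp1..cp3 and reports whether cp3 can be restored from local files. */
    static boolean cp3RestoresLocally(boolean registerAtTrigger) {
        LocalRecoveryRaceSketch tm = new LocalRecoveryRaceSketch(registerAtTrigger);
        tm.write("file1'"); tm.uploadDone("file1'"); tm.trigger(1); tm.confirm(1);
        tm.write("file2'"); tm.uploadDone("file2'"); tm.trigger(2); tm.confirm(2);
        tm.subsume(1);
        tm.write("file3'");
        Set<String> cp3 = tm.trigger(3);
        tm.materialize();        // queue cleared before cp3 is confirmed
        tm.uploadDone("file3'"); // assumption: only file3' is left in the queue
        tm.confirm(3);
        tm.subsume(2);
        return tm.canRestoreLocally(cp3);
    }

    public static void main(String[] args) {
        System.out.println("register on confirm:  " + cp3RestoresLocally(false));
        System.out.println("register at trigger: " + cp3RestoresLocally(true));
    }
}
```

      In this model, registering on confirm loses file1' and file2' when cp2 is subsumed, so cp3 cannot be restored locally. Registering at trigger time keeps all three local files, because the registered set no longer depends on the state of the queue.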

            People

              Yanfei Lei Yanfei Lei
              Yanfei Lei Yanfei Lei