Flink / FLINK-22754

Finished job not always archived properly


Details

    Description

      A Flink batch job, run on YARN (EMR) with Flink v1.12.1.

      Two separate runs used the same configuration:

      jobmanager.archive.fs.dir=s3://REDACTED-emr/history-server

       

      At the end of the first run, the job was properly archived:

      2021-05-20 07:14:39,479 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Shut down cluster because application is in SUCCEEDED, diagnostics null.
      2021-05-20 07:14:39,479 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Unregister application from the YARN Resource Manager with final status SUCCEEDED.
      2021-05-20 07:14:39,490 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl [] - Waiting for application to be successfully unregistered.
      2021-05-20 07:14:39,501 WARN org.apache.hadoop.util.NativeCodeLoader [] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      2021-05-20 07:14:40,277 INFO org.apache.hadoop.conf.Configuration.deprecation [] - fs.s3a.server-side-encryption-key is deprecated. Instead, use fs.s3a.server-side-encryption.key
      2021-05-20 07:14:40,531 WARN com.amazonaws.services.s3.internal.Mimetypes [] - Unable to find 'mime.types' file in classpath
      2021-05-20 07:14:40,727 INFO org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent [] - Closing components.
      2021-05-20 07:14:40,728 INFO org.apache.flink.runtime.dispatcher.runner.JobDispatcherLeaderProcess [] - Stopping JobDispatcherLeaderProcess.
      2021-05-20 07:14:40,728 INFO org.apache.flink.runtime.dispatcher.MiniDispatcher [] - Stopping dispatcher akka.tcp://flink@ip-172-31-9-230.eu-central-1.compute.internal:44219/user/rpc/dispatcher_1.
      2021-05-20 07:14:40,728 INFO org.apache.flink.runtime.dispatcher.MiniDispatcher [] - Stopping all currently running jobs of dispatcher akka.tcp://flink@ip-172-31-9-230.eu-central-1.compute.internal:44219/user/rpc/dispatcher_1.
      2021-05-20 07:14:40,729 INFO org.apache.flink.runtime.rest.handler.legacy.backpressure.BackPressureRequestCoordinator [] - Shutting down back pressure request coordinator.
      2021-05-20 07:14:40,729 INFO org.apache.flink.runtime.dispatcher.MiniDispatcher [] - Stopped dispatcher akka.tcp://flink@ip-172-31-9-230.eu-central-1.compute.internal:44219/user/rpc/dispatcher_1.
      2021-05-20 07:14:40,815 INFO org.apache.flink.runtime.history.FsJobArchivist [] - Job a2760da3fb41127b8d6f54a1573bd3fa has been archived at s3://REDACTED-emr/history-server/a2760da3fb41127b8d6f54a1573bd3fa.

       

      At the end of the second run, the log contains no mention of "FsJobArchivist", and no data was pushed to S3:

      2021-05-21 23:49:00,191 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Shut down cluster because application is in SUCCEEDED, diagnostics null.
      2021-05-21 23:49:00,193 INFO org.apache.flink.yarn.YarnResourceManagerDriver [] - Unregister application from the YARN Resource Manager with final status SUCCEEDED.
      2021-05-21 23:49:00,208 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl [] - Waiting for application to be successfully unregistered.
      2021-05-21 23:49:01,311 INFO org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent [] - Closing components.
      2021-05-21 23:49:01,312 INFO org.apache.flink.runtime.dispatcher.runner.JobDispatcherLeaderProcess [] - Stopping JobDispatcherLeaderProcess.
      2021-05-21 23:49:01,312 INFO org.apache.flink.runtime.dispatcher.MiniDispatcher [] - Stopping dispatcher akka.tcp://flink@ip-172-31-34-178.eu-central-1.compute.internal:35331/user/rpc/dispatcher_1.
      2021-05-21 23:49:01,312 INFO org.apache.flink.runtime.dispatcher.MiniDispatcher [] - Stopping all currently running jobs of dispatcher akka.tcp://flink@ip-172-31-34-178.eu-central-1.compute.internal:35331/user/rpc/dispatcher_1.
      2021-05-21 23:49:01,312 INFO org.apache.flink.runtime.rest.handler.legacy.backpressure.BackPressureRequestCoordinator [] - Shutting down back pressure request coordinator.
      2021-05-21 23:49:01,312 INFO org.apache.flink.runtime.dispatcher.MiniDispatcher [] - Stopped dispatcher akka.tcp://flink@ip-172-31-34-178.eu-central-1.compute.internal:35331/user/rpc/dispatcher_1.
      2021-05-21 23:49:01,337 WARN com.amazonaws.services.s3.internal.Mimetypes [] - Unable to find 'mime.types' file in classpath
      2021-05-21 23:49:01,460 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl [] - Closing the SlotManager.
      2021-05-21 23:49:01,460 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl [] - Suspending the SlotManager.
      2021-05-21 23:49:01,462 INFO org.apache.flink.runtime.blob.BlobServer [] - Stopped BLOB server at 0.0.0.0:40643
      2021-05-21 23:49:01,464 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Stopping Akka RPC service.
      2021-05-21 23:49:01,487 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Stopping Akka RPC service.
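
To check many runs for the same symptom, the comparison above can be automated by scanning each JobManager log for the FsJobArchivist confirmation line. The following is only a minimal sketch: the regular expression is modeled on the single confirmation message quoted in the first run, and the function name `find_archive_path` is hypothetical, not part of Flink.

```python
import re
import sys
from typing import Iterable, Optional

# Pattern modeled on the confirmation line from the first run above.
# Assumption: other Flink versions or log layouts may word this differently.
ARCHIVED = re.compile(
    r"FsJobArchivist \[\] - Job (?P<job_id>[0-9a-f]{32}) "
    r"has been archived at (?P<path>\S+?)\.?$"
)


def find_archive_path(log_lines: Iterable[str]) -> Optional[str]:
    """Return the archive path if the log confirms archiving, else None."""
    for line in log_lines:
        match = ARCHIVED.search(line.strip())
        if match:
            return match.group("path")
    return None


if __name__ == "__main__":
    # Usage: python check_archive.py jobmanager-run1.log jobmanager-run2.log
    for log_file in sys.argv[1:]:
        with open(log_file, encoding="utf-8", errors="replace") as handle:
            path = find_archive_path(handle)
        status = f"archived at {path}" if path else "NOT archived"
        print(f"{log_file}: {status}")
```

A missing confirmation line only shows that the archivist never logged success; it does not by itself explain why the archive step was skipped or interrupted.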

       

       

          People

            Assignee: Unassigned
            Reporter: Mathieu DESPRIEE (mathieude)