Flink / FLINK-5468

Restoring from a semi async RocksDB state backend (1.1) to 1.2 fails with ClassNotFoundException

    Details

      Description

      I think we should catch this exception and explain what's going on and how users can resolve the issue.

      org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed
      	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
      	at org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:210)
      	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
      	at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
      	at com.dataartisans.eventwindow.Generator.main(Generator.java:60)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
      	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
      	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:339)
      	at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:831)
      	at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
      	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)
      	at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
      	at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)
      	at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
      	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
      	at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)
      Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed
      	at org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:328)
      	at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:382)
      	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423)
      	... 22 more
      Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.flink.contrib.streaming.state.RocksDBStateBackend$FinalSemiAsyncSnapshot
      	at org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.deserialize(SavepointV0Serializer.java:162)
      	at org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.deserialize(SavepointV0Serializer.java:70)
      	at org.apache.flink.runtime.checkpoint.savepoint.SavepointStore.loadSavepoint(SavepointStore.java:138)
      	at org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:64)
      	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1348)
      	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1330)
      	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1330)
      	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
      	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
      	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
      	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      Caused by: java.lang.ClassNotFoundException: org.apache.flink.contrib.streaming.state.RocksDBStateBackend$FinalSemiAsyncSnapshot
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
      	at java.lang.Class.forName0(Native Method)
      	at java.lang.Class.forName(Class.java:270)
      	at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:66)
      	at org.apache.flink.migration.util.MigrationInstantiationUtil$ClassLoaderObjectInputStream.resolveClass(MigrationInstantiationUtil.java:66)
      	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
      	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at java.util.HashMap.readObject(HashMap.java:1180)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
      	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
      	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
      	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
      	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
      	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
      	at org.apache.flink.migration.util.MigrationInstantiationUtil.deserializeObject(MigrationInstantiationUtil.java:79)
      	at org.apache.flink.migration.util.MigrationInstantiationUtil.deserializeObject(MigrationInstantiationUtil.java:71)
      	at org.apache.flink.migration.util.SerializedValue.deserializeValue(SerializedValue.java:56)
      	at org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.determineOperatorChainLength(SavepointV0Serializer.java:327)
      	at org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.convertTaskState(SavepointV0Serializer.java:187)
      	at org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.convertSavepoint(SavepointV0Serializer.java:174)
      	at org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.deserialize(SavepointV0Serializer.java:160)
      	... 14 more
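
      One way to act on the suggestion in the description is to translate the ClassNotFoundException into an error that names the cause. The following is only a minimal sketch: the class `SemiAsyncRestoreHint`, the method `translate`, and the message wording are hypothetical and are not taken from the Flink code base. Only the class name in `REMOVED_CLASS` comes from the stack trace above.

```java
import java.io.IOException;

// Hypothetical helper, not part of Flink: turns the low-level
// ClassNotFoundException from the trace into a descriptive IOException.
final class SemiAsyncRestoreHint {

    // Class name copied from the stack trace in this issue.
    static final String REMOVED_CLASS =
        "org.apache.flink.contrib.streaming.state.RocksDBStateBackend$FinalSemiAsyncSnapshot";

    // Illustrative wording only (assumption), based on the statement in this
    // issue that migrating semi async snapshots is not supported in Flink 1.2.
    static final String HINT =
        "The savepoint contains RocksDB state written in semi async snapshot mode. "
            + "Migrating such state is not supported in Flink 1.2.";

    static IOException translate(ClassNotFoundException e) {
        if (REMOVED_CLASS.equals(e.getMessage())) {
            // Keep the original exception as the cause so the trace stays complete.
            return new IOException(HINT, e);
        }
        // Any other missing class is passed through with the generic wrapping.
        return new IOException(e);
    }

    public static void main(String[] args) {
        IOException wrapped = translate(new ClassNotFoundException(REMOVED_CLASS));
        System.out.println(wrapped.getMessage());
    }
}
```

      The original exception is kept as the cause, so the full trace remains available for debugging while the top-level message explains the situation.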
      

          Activity

          srichter Stefan Richter added a comment -

          Migrating from semi async snapshotting will not be supported in Flink 1.2. This is a known issue that was also communicated in the release notes. I agree that the error message could be improved here.

          githubbot ASF GitHub Bot added a comment -

          GitHub user StefanRRichter opened a pull request:

          https://github.com/apache/flink/pull/3119

          FLINK-5468 Improved error message for migrating semi async snapshot

          This PR addresses FLINK-5468.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/StefanRRichter/flink FLINK-5468-restoring-from-semi-async

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3119.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3119


          commit 537166e1ff510d4f84783b22f8cc6f8f0ee2f753
          Author: Stefan Richter <s.richter@data-artisans.com>
          Date: 2017-01-13T14:19:37Z

          FLINK-5468 Improved error message for migrating semi async RocksDB snapshot


          githubbot ASF GitHub Bot added a comment -

          Github user StefanRRichter commented on the issue:

          https://github.com/apache/flink/pull/3119

          cc @rmetzger

          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3119

          Good fix!

          I was wondering if we can make the "separation of concerns" a bit nicer if we keep this special case check out of the Savepoint Serializer, but instead add the "SemiAsyncSnapshot" class to the migration package and throw this exception in a static initializer of that class.
          Do you think that would work?
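
          The static initializer idea described in this comment could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the PR: the outer class `SemiAsyncMigrationSketch`, the helper names, and the message text are hypothetical, and the demo triggers class initialization directly instead of going through savepoint deserialization.

```java
// Hypothetical sketch: a placeholder class whose static initializer fails
// with a descriptive message as soon as the class is initialized.
final class SemiAsyncMigrationSketch {

    // Illustrative wording only (assumption), not the message used by Flink.
    static final String MESSAGE =
        "State written in semi async snapshot mode cannot be migrated to Flink 1.2.";

    // Placeholder that keeps a class with the old simple name resolvable.
    static final class FinalSemiAsyncSnapshot {
        static {
            // Java requires that a static initializer can complete normally,
            // so the throw is delegated to a helper method.
            failOnLoad();
        }
    }

    private static void failOnLoad() {
        throw new UnsupportedOperationException(MESSAGE);
    }

    // Forces initialization of the placeholder and returns the resulting hint.
    static String describeLoadFailure() {
        try {
            Class.forName(FinalSemiAsyncSnapshot.class.getName(), true,
                SemiAsyncMigrationSketch.class.getClassLoader());
            return null; // not reached in this sketch: initialization always fails
        } catch (ExceptionInInitializerError e) {
            // The JVM wraps the exception thrown by the static initializer.
            return e.getCause().getMessage();
        } catch (NoClassDefFoundError e) {
            // A class whose initializer failed stays unusable; later attempts
            // raise NoClassDefFoundError, so the same hint is returned.
            return MESSAGE;
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLoadFailure());
    }
}
```

          The design keeps the special case out of the serializer: the code that reads the savepoint stays generic, and the explanation lives next to the class name that triggers it.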

          githubbot ASF GitHub Bot added a comment -

          Github user StefanRRichter commented on the issue:

          https://github.com/apache/flink/pull/3119

          I changed the PR as @StephanEwen suggested.

          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3119

          Looks good to me, +1 to merge it

          githubbot ASF GitHub Bot added a comment -

          Github user uce commented on the issue:

          https://github.com/apache/flink/pull/3119

          Thanks for the PR. Looks good to me. I'm going to merge this.

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3119

          uce Ufuk Celebi added a comment -

          Fixed in d431373 (release-1.2), 988729e (master).


            People

            • Assignee:
              srichter Stefan Richter
              Reporter:
              rmetzger Robert Metzger