Details
Type: Improvement
Status: Open
Priority: Not a Priority
Resolution: Unresolved
Affects Version/s: 1.7.0
Fix Version/s: None
Description
As reported on the user mailing list thread "`env.java.opts` not persisting after job canceled or failed and then restarted", there can be issues with using native libraries and user code class loading.
Steps to reproduce
I was able to reproduce the issue reported on the mailing list using snappy-java in a user program. Running the attached user program works fine on initial submission, but results in a failure when re-executed.
I'm running Flink 1.7.0 on a standalone cluster started via bin/start-cluster.sh.
0. Unpack attached Maven project and build using mvn clean package or directly use attached hello-snappy-1.0-SNAPSHOT.jar
1. Download snappy-java-1.1.7.2.jar and unpack libsnappyjava for your system:
jar tf snappy-java-1.1.7.2.jar | grep libsnappy
...
org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
...
org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
...
2. Configure system library path to libsnappyjava in flink-conf.yaml (path needs to be adjusted for your system):
env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
3. Run attached hello-snappy-1.0-SNAPSHOT.jar
bin/flink run hello-snappy-1.0-SNAPSHOT.jar
Starting execution of program
Program execution finished
Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
Job Runtime: 359 ms
4. Rerun attached hello-snappy-1.0-SNAPSHOT.jar
bin/flink run hello-snappy-1.0-SNAPSHOT.jar
Starting execution of program
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 7d69baca58f33180cb9251449ddcd396)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
    at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
    at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
    ... 17 more
Caused by: java.lang.UnsatisfiedLinkError: Native Library /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded in another classloader
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1907)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1861)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:182)
    at org.xerial.snappy.SnappyLoader.loadSnappyApi(SnappyLoader.java:154)
    at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
    at com.github.uce.HelloSnappy.lambda$main$95f17bfa$1(HelloSnappy.java:13)
    at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
    at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
    at org.apache.flink.streaming.api.functions.source.FromElementsFunction.run(FromElementsFunction.java:164)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:94)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58)
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
    at java.lang.Thread.run(Thread.java:748)
Note: The attached user code configures Snappy to use libsnappyjava in the path specified by java.library.path (see org-xerial-snappy.properties). When bundling the native code in the user JAR, repeated execution works fine.
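The stack trace suggests the following mechanism: the JVM lets a given native library be loaded by only one class loader at a time, and the second submission appears to load the user code (including `org.xerial.snappy.Snappy`) through a different class loader, so the static initializer runs again and calls `System.loadLibrary` a second time. The sketch below illustrates only the class loading half of that, without any native code. `UserCodeClassLoaderDemo`, `NativeHolder`, `PerJobLoader`, and the `demo.clinit.runs` property are hypothetical names invented for this illustration; they are not Flink or snappy-java APIs, and `PerJobLoader` is only a rough stand-in for a per-job user code class loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class UserCodeClassLoaderDemo {

    /** Hypothetical stand-in for a class whose static initializer loads a native library. */
    public static class NativeHolder {
        static {
            // A real wrapper would call System.loadLibrary(...) here. This sketch only
            // counts initializer runs in a JVM-wide system property.
            int runs = Integer.getInteger("demo.clinit.runs", 0);
            System.setProperty("demo.clinit.runs", String.valueOf(runs + 1));
        }
    }

    /** Defines NativeHolder itself and delegates every other class to its parent. */
    static class PerJobLoader extends ClassLoader {
        PerJobLoader() {
            super(UserCodeClassLoaderDemo.class.getClassLoader());
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(NativeHolder.class.getName())) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the compiled class file from the parent loader's resources. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads and initializes NativeHolder in two fresh loaders, like two job submissions. */
    public static Class<?>[] loadTwice() throws ClassNotFoundException {
        String name = NativeHolder.class.getName();
        Class<?> first = Class.forName(name, true, new PerJobLoader());
        Class<?> second = Class.forName(name, true, new PerJobLoader());
        return new Class<?>[] {first, second};
    }

    public static void main(String[] args) throws Exception {
        System.clearProperty("demo.clinit.runs");
        Class<?>[] loaded = loadTwice();
        System.out.println("same class object: " + (loaded[0] == loaded[1]));
        System.out.println("static initializer runs: " + Integer.getInteger("demo.clinit.runs", 0));
    }
}
```

When compiled with javac and run from the class path, the two `Class` objects should differ and the initializer should run once per loader. If the initializer called `System.loadLibrary` for a library found on `java.library.path`, the second run would presumably hit the `UnsatisfiedLinkError` shown above for as long as the first loader is still alive.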
Attachments
Issue Links
- relates to FLINK-5408 RocksDB initialization can fail with an UnsatisfiedLinkError in the presence of multiple classloaders (Closed)