Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
1.7.0, 1.7.1
Description
When TM loads a offloaded TDD with large size, it may throw a "java.lang.OutOfMemoryError: Direct Buffer Memory" error. The loading uses nio's Files.readAllBytes() to read serialized TDD. In the call stack of Files.readAllBytes() , it will allocate a direct memory buffer which's size is equal the length of the file. This will cause OutOfMemoryErro error when direct memory is not enough.
If the length of a file is large than a maximum buffer size, the maximum size direct-buffer should be used to read bytes of the file to avoid direct memory OutOfMemoryError. The maximum buffer size can be 8K or others.
The exception stack is as follows (this exception stack is from an old Flink version, but the master branch has the same problem).
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:706)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:241)
at sun.nio.ch.IOUtil.read(IOUtil.java:195)
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:182)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.nio.file.Files.read(Files.java:3105)
at java.nio.file.Files.readAllBytes(Files.java:3158)
at org.apache.flink.runtime.deployment.TaskDeploymentDescriptor.loadBigData(TaskDeploymentDescriptor.java:338)
at org.apache.flink.runtime.taskexecutor.TaskExecutor.submitTask(TaskExecutor.java:397)
at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:211)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:155)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:133)
at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
... 9 more
Attachments
Issue Links
- links to