Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Fix Version/s: 0.9, 0.10.0
Component/s: None
Affects Version/s: 0.9-milestone-1
Environment: Exception on the cluster, local execution works
Description
The delta-iteration program in [1] fails with the following exception:
java.io.EOFException
at org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.nextSegment(InMemoryPartition.java:333)
at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.advance(AbstractPagedOutputView.java:140)
at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeByte(AbstractPagedOutputView.java:223)
at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeLong(AbstractPagedOutputView.java:291)
at org.apache.flink.runtime.memorymanager.AbstractPagedOutputView.writeDouble(AbstractPagedOutputView.java:307)
at org.apache.flink.api.common.typeutils.base.DoubleSerializer.serialize(DoubleSerializer.java:62)
at org.apache.flink.api.common.typeutils.base.DoubleSerializer.serialize(DoubleSerializer.java:26)
at org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:89)
at org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:29)
at org.apache.flink.runtime.operators.hash.InMemoryPartition.appendRecord(InMemoryPartition.java:219)
at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:536)
at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:347)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:209)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:270)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:745)
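For context, the failure happens while the iteration head materializes the initial solution set into the CompactingHashTable (readInitialSolutionSet in the stack trace above). Below is a minimal, hypothetical sketch of the kind of Scala delta-iteration job that exercises this code path; the case class, field names, and step function are placeholders, not the actual program, which is linked in [1].

import org.apache.flink.api.scala._

// Hypothetical record type; the real job in [1] uses its own case classes.
case class Rank(pageId: Long, rank: Double)

object DeltaIterationSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val initialSolution: DataSet[Rank] = env.fromElements(Rank(1L, 0.5), Rank(2L, 0.5))
    val initialWorkset: DataSet[Rank] = initialSolution

    // The iteration head first builds the solution-set hash table from
    // `initialSolution`; on the cluster this is where the EOFException occurs.
    val result = initialSolution.iterateDelta(initialWorkset, 10, Array("pageId")) {
      (solution, workset) =>
        val delta = workset.join(solution).where("pageId").equalTo("pageId") {
          (w, s) => Rank(s.pageId, 0.5 * (s.rank + w.rank))
        }
        // Return (solution-set delta, next workset).
        (delta, delta)
    }

    result.print()
  }
}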
For the logs and the accompanying mailing list discussion, see [2]-[4] below.
When running with a slightly different memory configuration, as hinted at on the mailing list, I sometimes get this exception instead:
19.Apr. 13:39:29 INFO Task - IterationHead(WorksetIteration (Resolved-Redirects)) (10/10) switched to FAILED : java.lang.IndexOutOfBoundsException: Index: 161, Size: 161
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.resetTo(InMemoryPartition.java:352)
at org.apache.flink.runtime.operators.hash.InMemoryPartition$WriteView.access$100(InMemoryPartition.java:301)
at org.apache.flink.runtime.operators.hash.InMemoryPartition.appendRecord(InMemoryPartition.java:226)
at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:536)
at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:347)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:209)
at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:270)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:745)
[1] Flink job: https://gist.github.com/knub/0bfec859a563009c1d57
[2] Job manager logs: https://gist.github.com/knub/01e3a4b0edb8cde66ff4
[3] One task manager's logs: https://gist.github.com/knub/8f2f953da95c8d7adefc
[4] Mailing list discussion: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/EOFException-when-running-Flink-job-td1092.html