Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
1.14.0
None
None
Description
A FetchEntryMessage reply failed to deserialize, and the failure caused a hang. The reply message had already deserialized its reply processor ID, so the waiting thread could have been notified of the error, woken up, and processed the problem.
[fatal 2020/09/16 14:04:59.318 PDT <P2P message reader for rs-GEM-3059-QQ1128-1a0i32xlarge-hydra-client-36(peergemfire_2_3_host1_14238:14238)<ec><v37>:41011(version:VersionOrdinal[ordinal=125]) shared unordered uid=1 local port=50589 remote port=51132> tid=0x46] Error deserializing message
java.io.IOException: Could not create an instance of org.apache.geode.internal.cache.partitioned.FetchEntryMessage$FetchEntryReplyMessage .
    at org.apache.geode.internal.serialization.internal.DSFIDSerializerImpl.invokeFromData(DSFIDSerializerImpl.java:330)
    at org.apache.geode.internal.serialization.internal.DSFIDSerializerImpl.create(DSFIDSerializerImpl.java:368)
    at org.apache.geode.internal.DSFIDFactory.create(DSFIDFactory.java:1024)
    at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2387)
    at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2401)
    at org.apache.geode.internal.tcp.Connection.readMessage(Connection.java:2930)
    at org.apache.geode.internal.tcp.Connection.processInputBuffer(Connection.java:2751)
    at org.apache.geode.internal.tcp.Connection.readMessages(Connection.java:1619)
    at org.apache.geode.internal.tcp.Connection.run(Connection.java:1456)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Unknown header byte 109
    at org.apache.geode.internal.serialization.DscodeHelper.toDSCODE(DscodeHelper.java:40)
    at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2494)
    at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2864)
    at org.apache.geode.DataSerializer.readLinkedList(DataSerializer.java:2096)
    at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2574)
    at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2864)
    at org.apache.geode.internal.cache.NonLocalRegionEntry.fromData(NonLocalRegionEntry.java:159)
    at org.apache.geode.internal.cache.EntrySnapshot.fromData(EntrySnapshot.java:282)
    at org.apache.geode.internal.cache.EntrySnapshot.<init>(EntrySnapshot.java:253)
    at org.apache.geode.internal.cache.partitioned.FetchEntryMessage$FetchEntryReplyMessage.fromData(FetchEntryMessage.java:309)
    at org.apache.geode.internal.serialization.internal.DSFIDSerializerImpl.invokeFromData(DSFIDSerializerImpl.java:317)
    ... 11 more
Deserialization code for cache operations in Connection.java already handles this sort of thing by using a thread local that is updated by the fromData method of each cache operation. ReplyMessage could do the same thing with a new thread local and Connection.java could check that thread local in the event of a deserialization error.
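The proposal above could look roughly like the following sketch. It is not Geode code: the class ReplyErrorSketch, the thread local REPLY_PROCESSOR_ID, the WAITERS registry, and the toy wire format (a 4-byte processor ID, one header byte, then a UTF string) are all hypothetical stand-ins chosen for illustration. The idea it shows is only that fromData records the processor ID as soon as it has been read, so the reader can still fail the right waiter when the rest of the message cannot be deserialized.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not part of Geode.
class ReplyErrorSketch {
  // Stand-in for the proposed thread local; 0 means "no processor ID read yet".
  static final ThreadLocal<Integer> REPLY_PROCESSOR_ID = ThreadLocal.withInitial(() -> 0);

  // Stand-in for the reply processors that waiting threads block on, keyed by ID.
  static final Map<Integer, CompletableFuture<String>> WAITERS = new ConcurrentHashMap<>();

  // Mimics a reply's fromData: the processor ID comes first and is recorded
  // in the thread local before the rest of the payload is read.
  static String fromData(DataInputStream in) throws IOException {
    int processorId = in.readInt();
    REPLY_PROCESSOR_ID.set(processorId);
    byte header = in.readByte();
    if (header != 1) { // assumed: 1 is the only valid header in this toy format
      throw new IOException("Unknown header byte " + header);
    }
    return in.readUTF();
  }

  // Mimics the message reader: on a deserialization error it checks the
  // thread local and, if an ID was recorded, fails the matching waiter
  // instead of leaving it blocked.
  static void readMessage(byte[] bytes) {
    REPLY_PROCESSOR_ID.set(0);
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      String payload = fromData(in);
      CompletableFuture<String> waiter = WAITERS.remove(REPLY_PROCESSOR_ID.get());
      if (waiter != null) {
        waiter.complete(payload);
      }
    } catch (IOException e) {
      int id = REPLY_PROCESSOR_ID.get();
      CompletableFuture<String> waiter = id != 0 ? WAITERS.remove(id) : null;
      if (waiter != null) {
        waiter.completeExceptionally(e);
      }
    } finally {
      REPLY_PROCESSOR_ID.remove(); // avoid leaking state into the next message
    }
  }
}
```

In this sketch, a thread that registered a future under ID 42 and then received the bytes {0, 0, 0, 42, 109} would see its future completed exceptionally with "Unknown header byte 109" rather than waiting forever. Resetting the thread local before each message keeps a stale ID from an earlier reply from being blamed for a later failure.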