Details
Description
Dear developers,
I think I have found the reason of some disconnection from the PLC (Siemens 1200 S7) I see on the logs of a software I am writing. Let me explain the scenario.
The software reads a lot of data from the PLC a polling variable and a complete DataBlock I am using a Raspberry like a client an a Siemens S7 1200 PLC.
The reading are usually fine but randomly I have an error in the logs coming from low level Driver.
I am using 0.8.0-SNAPSHOT coming from pool2 experimental feature.
The error I read on the logs is:
2021-01-30 10:27:58 WARN CachedDriverManager - Watchdog detected a long borrowed connection, will be forcefully closed! 2021-01-30 10:27:58 WARN CachedDriverManager - Broken Connection was returned, although it is not borrowed, currently. 2021-01-30 10:27:58 ERROR DbFetcher - Time out Exception in reading DataBlock java.util.concurrent.TimeoutException: null at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886) ~[?:?] at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021) ~[?:?] at it.fox.chronos.plc.observers.DbFetcher.onObservableChanged(DbFetcher.java:92) [chronos-0.0.3.jar:0.0.3] at it.fox.chronos.plc.Poller.notifyObservers(Poller.java:177) [chronos-0.0.3.jar:0.0.3] at it.fox.chronos.plc.Poller.run(Poller.java:105) [chronos-0.0.3.jar:0.0.3] at java.lang.Thread.run(Thread.java:834) [?:?] 2021-01-30 10:27:58 INFO PlcDriverManager - Instantiating new PLC Driver Manager with class loader jdk.internal.loader.ClassLoaders$AppClassLoader@c77c2e 2021-01-30 10:27:58 WARN CachedPlcConnection - Request finished with exception. Reporting Connection as Broken java.util.concurrent.CancellationException: null at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396) ~[?:?] at org.apache.plc4x.java.utils.connectionpool2.CachedPlcConnection.lambda$wrapWithTimeout$0(CachedPlcConnection.java:84) ~[plc4j-connection-cache-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] 2021-01-30 10:27:58 INFO PlcDriverManager - Registering available drivers... 2021-01-30 10:27:58 WARN CachedDriverManager - Broken Connection was returned, although it is not borrowed, currently. 2021-01-30 10:27:58 INFO PlcDriverManager - Registering driver for Protocol s7 (Siemens S7 (Basic)) 2021-01-30 10:27:58 INFO TcpChannelFactory - Configuring Bootstrap with Configuration{local-rack=1, local-slot=1, remote-rack=0, remot-slot=0, pduSize=1024, maxAmqCaller=8, maxAmqCallee=8, controllerType='null'} 2021-01-30 10:27:58 INFO S7ProtocolLogic - S7 Driver running in ACTIVE mode.
At the first sight it seems that a connection became stale and the watchdog drop it but I think this is just the effect of an error on the wire.
Here is the screenshot of the capture on the wire (I attach the capture to the ticket too).
At 2021-01-30 10:27:53.899233 there's a request for 7 variables in the DB2:
Frame 9: 157 bytes on wire (1256 bits), 157 bytes captured (1256 bits) Ethernet II, Src: KUNBUS_01:5e:53 (c8:3e:a7:01:5e:53), Dst: SiemensN_02:19:4c (00:1c:06:02:19:4c) Internet Protocol Version 4, Src: 192.168.1.190, Dst: 192.168.1.192 Transmission Control Protocol, Src Port: 36240, Dst Port: 102, Seq: 94, Ack: 736, Len: 103 TPKT, Version: 3, Length: 103 ISO 8073/X.224 COTP Connection-Oriented Transport Protocol S7 Communication Header: (Job) Parameter: (Read Var) Function: Read Var (0x04) Item count: 7 Item [1]: (DB 2.DBX 3238.0 INT 47) Item [2]: (DB 2.DBX 7348.0 BIT 1) Item [3]: (DB 2.DBX 24.0 DINT 1) Item [4]: (DB 2.DBX 7366.0 DINT 1) Item [5]: (DB 2.DBX 66.0 REAL 1) Item [6]: (DB 2.DBX 86.0 INT 5) Item [7]: (DB 2.DBX 74.0 REAL 1)
Without waiting the response, immediately after this request, there's an another request for an another variable, a single REAL again from the same DB2:
Frame 10: 85 bytes on wire (680 bits), 85 bytes captured (680 bits) Ethernet II, Src: KUNBUS_01:5e:53 (c8:3e:a7:01:5e:53), Dst: SiemensN_02:19:4c (00:1c:06:02:19:4c) Internet Protocol Version 4, Src: 192.168.1.190, Dst: 192.168.1.192 Transmission Control Protocol, Src Port: 36240, Dst Port: 102, Seq: 197, Ack: 736, Len: 31 TPKT, Version: 3, Length: 31 ISO 8073/X.224 COTP Connection-Oriented Transport Protocol S7 Communication Header: (Job) Parameter: (Read Var) Function: Read Var (0x04) Item count: 1 Item [1]: (DB 2.DBX 4938.0 REAL 24)
I think this is the root cause of the error and the real reason because the watchdog stand up and drop the connection.
The PLC solicited by this double request send a couple of RST to the library and the connection dye.
I have many of this disconnections, more or less one every 15 minutes. All the good request has a query and an answer, all the double request receive a RST from the PLC.
My code perform only synchronous request in this way:
PlcReadResponse response = readRequest.execute().get(5000, TimeUnit.MILLISECONDS);
I think the double request came from the pool2 or the driver (sorry I am not able to understand that).
Do you think is this a bug? Is it something fixable ?
Regards,
Stefano Bossi