In our production systems, we had several threads stuck indefinitely while trying to send a file to an FtpEndpoint. We had both the connectTimeout and soTimeout properties set, so this came as a surprise.
After some digging, we found that the scenario is quite simple to reproduce: it happens every time the FTPClient establishes a TCP connection with a server that never sends any response.
Here is a simplified view of what happens when establishing a connection using an FTPClient:
A SocketTimeoutException can be thrown either during the initial socket connect action or during the __getReply() call, where the FTPClient waits for the hello message from the server. Both use connectTimeout, after which the original (default) timeout is restored. The soTimeout we specified in the URI is configured by FTPOperations only when the connection is successful. In our scenario, the Socket is connected, but the exception is thrown afterwards while waiting for the hello message, so the soTimeout is never applied and the socket timeout is left at 0, which means an infinite wait.
Within Camel, when the RemoteFileProducer encounters an exception while processing an Exchange, it tries to disconnect the endpoint properly with a logout followed by a disconnect.
Unfortunately, at this point, client.logout() sends the FTP QUIT command and then waits for the response, still using the default timeout of the Socket. Since the misbehaving server/firewall never sends any form of response, the thread is left waiting forever.
I attached a simple test case to illustrate the scenario, simply using a ServerSocket that never accepts any connection. Also included is an easy workaround that uses a custom FTPClient on which the default timeout is set.
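The attached test case is not reproduced here. As a rough illustration of the same idea, the sketch below uses only plain java.net sockets (no Camel or commons-net): a ServerSocket that never calls accept() still lets the client's TCP connection complete through the listen backlog, but no byte is ever sent back. The class and method names (SilentServerDemo, readTimesOut) are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of Camel or commons-net.
public class SilentServerDemo {

    /**
     * Connects to a listener that never calls accept() and waits for data.
     * Returns true if the read gave up with a SocketTimeoutException.
     */
    static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        // The OS completes the TCP handshake from the listen backlog,
        // even though accept() is never called on the server side.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            // A value of 0 here would mean "block forever", which is the bug.
            client.setSoTimeout(soTimeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // stands in for waiting on the FTP greeting or the QUIT reply
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(500));
    }
}
```

Passing 0 instead of a positive value to readTimesOut would leave the calling thread blocked in read() indefinitely, which matches what we saw in production.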
A possible fix would be to always set the default timeout on the socket, before connecting it.
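To make the proposed fix concrete, here is a minimal sketch at the plain java.net level, assuming the timeout is applied to an unconnected Socket before connect() is called. It is not the actual FTPClient or FTPOperations code; the class name TimeoutBeforeConnect and the open() helper are hypothetical.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketAddress;

// Hypothetical helper, shown only to illustrate the ordering of the calls.
public class TimeoutBeforeConnect {

    /** Opens a socket whose read timeout is already in place when connect() returns. */
    static Socket open(SocketAddress server, int connectTimeoutMillis, int soTimeoutMillis)
            throws IOException {
        Socket socket = new Socket(); // created unconnected
        try {
            // Set the read timeout first, so no later read can block forever,
            // even if an exception interrupts the rest of the setup.
            socket.setSoTimeout(soTimeoutMillis);
            socket.connect(server, connectTimeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }
}
```

With this ordering, a later logout-style read on the same socket would fail with a SocketTimeoutException after soTimeoutMillis instead of hanging.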