Camel / CAMEL-12830

FTP producer stuck if timeout occurs just after connect

Details

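    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.22.1
    • Fix Version/s: 2.22.2, 2.23.0
    • Component/s: camel-ftp
    • Labels: None
    • Patch Info: Patch Available
    • Estimated Complexity: Unknown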

    Description

      In our production systems, we had several threads stuck indefinitely while trying to send a file to an FtpEndpoint. We had both connectTimeout and soTimeout properties set so it surprised us a little bit.

      After digging a bit, we found that the scenario is quite simple to reproduce: it happens every time the FTPClient establishes the TCP connection with a server that never sends any response.
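
The "connected but silent" condition can be sketched with plain java.net sockets, without commons-net or Camel. This is only an illustration under assumptions: the class and method names are hypothetical, and the silent server is a listening ServerSocket that never calls accept(), relying on the operating system to complete the TCP handshake from the listen backlog.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of Camel or commons-net.
class SilentServerDemo {

    /**
     * Connects to a listener that never accepts or writes, then waits for a
     * first byte (standing in for the FTP greeting). Returns true if the read
     * timed out while the socket still reports itself as connected.
     */
    static boolean connectedButSilent(int connectTimeoutMs, int readTimeoutMs) throws IOException {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {
            // The connect succeeds even though accept() is never called on the server side.
            socket.connect(new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort()),
                    connectTimeoutMs);
            socket.setSoTimeout(readTimeoutMs);
            try {
                socket.getInputStream().read(); // nothing will ever arrive
                return false;
            } catch (SocketTimeoutException e) {
                return socket.isConnected(); // still connected, just no reply
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("connected but silent: " + connectedButSilent(1000, 200));
    }
}
```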

      Here is a simplified view of what happens when establishing a connection using a FTPClient:

      // within SocketClient
      public void connect(InetAddress host, int port) throws SocketException, IOException {
          _socket_.connect(new InetSocketAddress(host, port), connectTimeout);
          _connectAction_();
      }
      protected void _connectAction_() throws IOException {
          _socket_.setSoTimeout(_timeout_); // _timeout_ is the default timeout of the socket
      }

      // overridden within FTP
      @Override
      protected void _connectAction_() throws IOException {
          super._connectAction_();
          if (connectTimeout > 0) {
              int original = _socket_.getSoTimeout();
              _socket_.setSoTimeout(connectTimeout);
              try {
                  __getReply(); // waits for the server greeting; may throw SocketTimeoutException
              } finally {
                  _socket_.setSoTimeout(original); // restored even when the greeting never arrives
              }
          }
      }

      A SocketTimeoutException can be thrown either during the initial socket connect or during __getReply(), where the FTPClient waits for the greeting from the server. Both use connectTimeout, after which the original (default) timeout is restored. The soTimeout we specified in the URI is applied by FtpOperations only once the connection has succeeded. In this case the socket is connected, but the exception is thrown afterwards, so the soTimeout is left at 0, which means reads block indefinitely.
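
The effect of the save-and-restore pattern can be shown with a small sketch that uses only java.net, not the real FTPClient. The names below are hypothetical, and the single read() stands in for __getReply(); the point is only that after the greeting times out, the socket is back at whatever timeout it had before, which is 0 when no default was configured.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class mirroring the simplified _connectAction_ logic.
class RestoredTimeoutDemo {

    /** Returns the SO_TIMEOUT left on the socket after the greeting wait has failed. */
    static int soTimeoutAfterFailedGreeting(int connectTimeoutMs) throws IOException {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort()),
                    connectTimeoutMs);
            int original = socket.getSoTimeout(); // 0 because no default timeout was set
            socket.setSoTimeout(connectTimeoutMs);
            try {
                socket.getInputStream().read(); // stands in for waiting on the greeting
            } catch (SocketTimeoutException greetingNeverCame) {
                // the real client would propagate this exception to the caller
            } finally {
                socket.setSoTimeout(original); // back to 0, i.e. block forever on reads
            }
            return socket.getSoTimeout();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("SO_TIMEOUT after failed greeting: " + soTimeoutAfterFailedGreeting(200));
    }
}
```

Any later blocking read on that socket, such as waiting for the reply to QUIT, then has no upper bound.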

      Within Camel, when the RemoteFileProducer encounters an exception while processing an Exchange, it tries to disconnect the endpoint properly with a logout followed by a disconnect:

      // RemoteFileProducer
      public void handleFailedWrite(Exchange exchange, Exception exception) throws Exception {
          try {
              if (getOperations().isConnected()) { // <== in our case, this returns true because the socket is actually connected
                  getOperations().disconnect();
              }
          } catch (...) {
              ...
          }
      }
      // FtpOperations
      protected void doDisconnect() throws GenericFileOperationFailedException {
          try {
              client.logout(); // sends QUIT, then blocks waiting for the reply
          } catch (IOException e) {
              throw new GenericFileOperationFailedException(...);
          } finally {
              try {
                  client.disconnect();
              } catch (IOException e) {
                  throw new GenericFileOperationFailedException(...);
              }
          }
      }

      Unfortunately, at this point client.logout() sends the FTP QUIT command and then waits for the reply, still using the default timeout of the socket. Since the misbehaving server/firewall never sends any form of response, the thread is left waiting forever.

      I attached a simple test case to illustrate the scenario, simply using a ServerSocket that never accepts any connection. Also included is an easy workaround that uses a custom FTPClient on which the default timeout is set. 

      A possible fix would be to always set the default timeout on the socket, before connecting it.
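
One way to read this suggestion, sketched with plain java.net sockets rather than the actual Camel or commons-net change: if a non-zero timeout is applied to the socket before it is connected, the save-and-restore around the greeting puts that value back, so the later wait for the QUIT reply is bounded. The class and method names are hypothetical, and the literal "QUIT" write only imitates what a logout would send.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not the patch that was applied to camel-ftp.
class DefaultTimeoutFixDemo {

    /** Returns true if the wait for the QUIT reply ends with a timeout instead of hanging. */
    static boolean logoutReadIsBounded(int defaultTimeoutMs, int connectTimeoutMs) throws IOException {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {
            socket.setSoTimeout(defaultTimeoutMs); // default timeout set before connecting
            socket.connect(new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort()),
                    connectTimeoutMs);

            int original = socket.getSoTimeout(); // now defaultTimeoutMs instead of 0
            socket.setSoTimeout(connectTimeoutMs);
            try {
                socket.getInputStream().read(); // greeting wait, times out
            } catch (SocketTimeoutException greetingNeverCame) {
                // fall through to the cleanup path, as handleFailedWrite would
            } finally {
                socket.setSoTimeout(original);
            }

            // Cleanup path: send QUIT and wait for a reply that never comes.
            socket.getOutputStream().write("QUIT\r\n".getBytes(StandardCharsets.US_ASCII));
            socket.getOutputStream().flush();
            try {
                socket.getInputStream().read();
                return false; // a byte or EOF arrived, not the case being shown
            } catch (SocketTimeoutException bounded) {
                return true; // the thread is released after defaultTimeoutMs
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("logout wait is bounded: " + logoutReadIsBounded(200, 200));
    }
}
```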

          People

            Assignee: acosentino Andrea Cosentino
            Reporter: lchdev Laurent Chiarello