MINA SSHD / SSHD-1197

Race condition in KEX

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0
    • None

    Description

      There is a race condition in the KEX implementation. To reproduce it, modify SftpTransferTest by inserting the following in method doTestTransferIntegrity(), just before the main try-finally:

      CoreModuleProperties.REKEY_BLOCKS_LIMIT.set(client, Long.valueOf(65536));
      CoreModuleProperties.REKEY_BLOCKS_LIMIT.set(sshd, Long.valueOf(65536));
      try (ClientSession session = createAuthenticatedClientSession();
          ...
      

      This forces rekeying every 512kB; the test round-trips a 10MB file twice, which gives ample opportunity to hit the race condition. Typically the test fails very quickly and then hangs.
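      As a rough check of these figures, here is a small sketch of the arithmetic. It assumes that the 512kB interval comes from 65536 blocks of 8 bytes each (the block size is inferred from the numbers above, not stated in this report) and that 10MB means 10 * 1024 * 1024 bytes. The class and method names are made up for illustration.

```java
class RekeyArithmetic {

    // Bytes that may pass before the block limit triggers a new key exchange.
    static long bytesBetweenRekeys(long blocksLimit, int blockSizeBytes) {
        return blocksLimit * blockSizeBytes;
    }

    // Number of times the limit is reached while moving transferBytes in one direction.
    static long rekeysForTransfer(long transferBytes, long bytesPerRekey) {
        return transferBytes / bytesPerRekey;
    }

    public static void main(String[] args) {
        // Assumption: 8-byte blocks, which is what the 512kB figure implies.
        long perRekey = bytesBetweenRekeys(65536L, 8);
        long fileBytes = 10L * 1024 * 1024;

        System.out.println((perRekey / 1024) + " kB between rekeys");
        System.out.println(rekeysForTransfer(fileBytes, perRekey)
                + " rekeys per 10MB pass in one direction");
    }
}
```

      Under these assumptions a single 10MB pass already crosses the limit about 20 times, so each test run goes through the critical window many times.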

      The hang is always caused by an async write not being cancelled when the session gets disconnected. The session disconnects during KEX because the race condition corrupts the KEX state.

      Most of the time the race condition causes a signature verification failure during KEX, but I also got "Disconnecting(ClientSessionImpl[testTransferIntegrity@/127.0.0.1:62186]): SSH2_DISCONNECT_KEY_EXCHANGE_FAILED - Unable to negotiate key exchange for mac algorithms (server to client) (client: null / server: hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1)", which is easier to understand than the signature verification failure.

      The precise sequence of events in that case is:

          Server                                                    Client
      
      1.  thread-nio-4 requestNewKeyExchange DONE->INIT
      2.  thread-nio-4 sendKexInit
      3.                                                            main            requestNewKeyExchange DONE->INIT
      4.                                                            thread-nio-3    handleKexInit
      5.                                                            thread-nio-3      doKexNegotiation INIT->RUN
      6.                                                            thread-nio-3        negotiate -> Exception: client proposal null
      7.                                                            main            sendKexInit
      8.  thread-nio-2 receive KEX_INIT      INIT->RUN
      9.                                                            thread-nio-3    Exception caught
      10.                                                           thread-nio-3    Disconnecting
      11. thread-nio-5 process SSH_MSG_DISCONNECT (KexState RUN)
      

      There is a window between steps 3 and 7 in AbstractSession.requestNewKeyExchange() during which the KEX state is INIT, but the client proposal isn't initialized yet; it's initialized only after sendKexInit() has been done. However, the client has already received the server's KEX_INIT message, and doKexNegotiation() proceeds as if the client proposal were already set up.
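      The window can be illustrated with a minimal, single-threaded sketch that replays the interleaving deterministically. Everything in it (the Session class, its method names, the string proposals) is hypothetical and heavily simplified; it is not the MINA SSHD code and not necessarily how the issue was fixed. It only contrasts publishing the state before the proposal with publishing both in one critical section.

```java
class KexRaceSketch {

    enum KexState { DONE, INIT, RUN }

    /** Hypothetical stand-in for the session's KEX bookkeeping. */
    static class Session {
        private final Object lock = new Object();
        private KexState state = KexState.DONE;
        private String ownProposal; // null until our own KEX_INIT has been built

        // Racy order, first half (step 3): state changes, proposal still unset.
        void beginRequestRacy() {
            synchronized (lock) {
                state = KexState.INIT;
            }
        }

        // Racy order, second half (step 7): the proposal becomes visible only now.
        void finishRequestRacy(String proposal) {
            synchronized (lock) {
                ownProposal = proposal;
            }
        }

        // One possible alternative: publish proposal and state together.
        void requestAtomically(String proposal) {
            synchronized (lock) {
                ownProposal = proposal;
                state = KexState.INIT;
            }
        }

        // Steps 4 to 6: the peer's KEX_INIT arrives and negotiation starts.
        String handlePeerKexInit(String peerProposal) {
            synchronized (lock) {
                if (state != KexState.INIT) {
                    throw new IllegalStateException("unexpected state " + state);
                }
                state = KexState.RUN;
                if (ownProposal == null) {
                    throw new IllegalStateException("own proposal is null");
                }
                return ownProposal + " <-> " + peerProposal;
            }
        }

        KexState state() {
            synchronized (lock) {
                return state;
            }
        }
    }

    public static void main(String[] args) {
        // Replay steps 3 to 6: the peer's KEX_INIT is handled inside the window.
        Session racy = new Session();
        racy.beginRequestRacy();
        try {
            racy.handlePeerKexInit("server-proposal");
        } catch (IllegalStateException e) {
            System.out.println("racy order failed: " + e.getMessage());
        }

        // Same interleaving, but state and proposal were published together.
        Session safe = new Session();
        safe.requestAtomically("client-proposal");
        System.out.println("atomic order negotiated: "
                + safe.handlePeerKexInit("server-proposal"));
    }
}
```

      In a real session the proposal is produced by sending KEX_INIT, which involves I/O, so holding a lock across that call may not be acceptable; the sketch only shows why the state and the proposal must not become visible separately.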

      So there are two problems here:

      1. KEX fails due to a race condition.
      2. The client hangs after disconnecting because an async write future is not terminated.
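      The second problem can be sketched in isolation. The example below uses java.util.concurrent.CompletableFuture as a stand-in for the session's write futures; the Channel class and its methods are hypothetical and do not correspond to the MINA SSHD API. It shows one way to make sure that closing terminates every pending write, so that a caller waiting on the future gets an exception instead of hanging.

```java
import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PendingWriteSketch {

    /** Hypothetical channel that tracks writes which have not completed yet. */
    static class Channel {
        private final Queue<CompletableFuture<Void>> pending = new ConcurrentLinkedQueue<>();
        private volatile boolean closed;

        // Registers a write; in this sketch nothing ever completes it normally.
        CompletableFuture<Void> writeAsync(byte[] data) {
            CompletableFuture<Void> future = new CompletableFuture<>();
            pending.add(future);
            // Re-check after registering, in case close() drained the queue in between.
            if (closed && pending.remove(future)) {
                future.completeExceptionally(new IOException("channel closed"));
            }
            return future;
        }

        // Fails every write that is still pending, so that no waiter is left behind.
        void close() {
            closed = true;
            CompletableFuture<Void> future;
            while ((future = pending.poll()) != null) {
                future.completeExceptionally(new IOException("channel closed"));
            }
        }
    }

    public static void main(String[] args) throws InterruptedException, TimeoutException {
        Channel channel = new Channel();
        CompletableFuture<Void> write = channel.writeAsync(new byte[] {1, 2, 3});

        channel.close(); // simulates the disconnect during KEX

        try {
            write.get(1, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            System.out.println("write terminated: " + e.getCause().getMessage());
        }
    }
}
```

      Without the loop in close(), the get() call would only return because of its timeout, which mirrors the hang described above.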

            People

              twolf Thomas Wolf

                Time Tracking

                  Original Estimate - Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 2h 40m