Description
I was trying to enable SSL in Mesos but had produced an invalid certificate: it was signed with a mismatching key. When I started the master, the error message I received was rather confusing:
W0503 10:15:58.027343 6696 openssl.cpp:363] Failed SSL connections will be downgraded to a non-SSL socket
Could not load key file
To me, this error message suggested that the key file did not exist or had permission issues. However, a quick strace showed that the key file was opened and read without problems; there was no sign of a file-not-found error or anything similar.
The problem here is the hardcoded error message, which does not take OpenSSL's human-readable error strings into account.
The code that misled me is located at https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/openssl.cpp#L471
We might want to change
    // Set private key.
    if (SSL_CTX_use_PrivateKey_file(
            ctx,
            ssl_flags->key_file.get().c_str(),
            SSL_FILETYPE_PEM) != 1) {
      EXIT(EXIT_FAILURE) << "Could not load key file";
    }
to something like this:
    // Set private key.
    if (SSL_CTX_use_PrivateKey_file(
            ctx,
            ssl_flags->key_file.get().c_str(),
            SSL_FILETYPE_PEM) != 1) {
      EXIT(EXIT_FAILURE) << "Could not use key file: "
                         << ERR_error_string(ERR_get_error(), NULL);
    }
To receive a much more helpful message like this
W0503 13:18:12.551364 11572 openssl.cpp:363] Failed SSL connections will be downgraded to a non-SSL socket
Could not use key file: error:0B080074:x509 certificate routines:X509_check_private_key:key values mismatch
A quick scan of openssl.cpp suggests that there are more places where a hardcoded message could be replaced with a more descriptive one that includes the OpenSSL error string.