IMPALA-2739

Catalog tries to talk to the statestore using SSL before it enables SSL


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Not A Bug
    • Impala 2.3.0

    Description

      When enabling daemon-to-daemon SSL with Kerberos, I noticed that the impalad connects to the statestore without any problems, but the catalog always fails to authenticate the statestore, with the error message:

      Couldn't open transport for e1320.halxg.cloudera.com:24000 (SSL_get_verify_result(), certificate signature failure)
      

      This caused the hang mentioned in IMPALA-2598.

      Our current theory is that the Catalog server tries to subscribe to the Statestore before it calls EnableSsl() (done in catalogd-main.cc) and sets up its SSL server socket (which happens in server->Start(), where 'server' is of type 'ThriftServer').

      Theoretically, since Impala does only one-way authentication in SSL/TLS, this shouldn't matter, because the Catalog connects to the Statestore as a client. However, moving the statestore subscribe code to after the SSL server socket is set up seems to have fixed the problem. This might be because of some internal OpenSSL state that gets set during server setup. I'm still looking into that.

      int main(int argc, char** argv) {
      
       ...
      
        EXIT_IF_ERROR(catalog_server.Start()); // Currently we try to subscribe to the statestore from here.
      
       ...
      
        ThriftServer* server = new ThriftServer("CatalogService", processor,
            FLAGS_catalog_service_port, NULL, metrics.get(), 5);
        if (EnableInternalSslConnections()) {
          LOG(INFO) << "Enabling SSL for CatalogService";
          EXIT_IF_ERROR(server->EnableSsl(FLAGS_ssl_server_certificate, FLAGS_ssl_private_key,
              FLAGS_ssl_private_key_password_cmd));
        }
        EXIT_IF_ERROR(server->Start()); // This is where we set up the SSL server socket.
      
       ...
      
      }
      

      The current fix simply moves the catalog_server.Start() call to after server->Start().
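
      The reordering described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual catalogd-main.cc: StartupLog, StubThriftServer, StubCatalogServer and RunReorderedStartup are hypothetical stand-ins that only record the order of calls. They do no networking or TLS, so the sketch shows the proposed call order and says nothing about the underlying cause.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: records the order in which startup steps run.
struct StartupLog {
  std::vector<std::string> events;
};

// Hypothetical stand-in for Impala's ThriftServer. It only tracks state;
// it opens no sockets and performs no TLS setup.
class StubThriftServer {
 public:
  explicit StubThriftServer(StartupLog* log) : log_(log) {}

  void EnableSsl() {
    ssl_enabled_ = true;
    log_->events.push_back("EnableSsl");
  }

  // In the real server, this is where the SSL server socket is set up.
  void Start() {
    started_ = true;
    log_->events.push_back("ThriftServer::Start");
  }

  bool ssl_ready() const { return ssl_enabled_ && started_; }

 private:
  StartupLog* log_;
  bool ssl_enabled_ = false;
  bool started_ = false;
};

// Hypothetical stand-in for the catalog server. Start() represents the
// point at which the catalog subscribes to the statestore.
class StubCatalogServer {
 public:
  StubCatalogServer(StartupLog* log, const StubThriftServer* server)
      : log_(log), server_(server) {}

  void Start() {
    // Remember whether the SSL server socket was already set up.
    subscribed_after_ssl_ = server_->ssl_ready();
    log_->events.push_back("CatalogServer::Start");
  }

  bool subscribed_after_ssl() const { return subscribed_after_ssl_; }

 private:
  StartupLog* log_;
  const StubThriftServer* server_;
  bool subscribed_after_ssl_ = false;
};

// Proposed order: enable SSL, start the Thrift server, then subscribe.
// Returns true if the subscription happened after SSL setup.
bool RunReorderedStartup(StartupLog* log) {
  StubThriftServer server(log);
  StubCatalogServer catalog_server(log, &server);
  server.EnableSsl();
  server.Start();
  catalog_server.Start();  // Moved to after server.Start().
  return catalog_server.subscribed_after_ssl();
}
```

      Calling RunReorderedStartup() with an empty StartupLog records the three steps in the proposed order and returns true, which makes the intended sequencing easy to check in isolation.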

          People

            Sailesh Mukil (sailesh)