Ambari / AMBARI-16146

Hive View Synchronized Around Entire Connection Creation Causing Deadlock

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: None
    • Labels: None

    Description

      The Hive view uses synchronized methods when creating connections:

      ConnectionFactory

        public synchronized HdfsApi getHDFSApi() {
          if (hdfsApi == null) {
            try {
              hdfsApi = HdfsUtil.connectToHDFSApi(context);
            } catch (Exception ex) {
              throw new ServiceFormattedException("HdfsApi connection failed. Check \"webhdfs.url\" property", ex);
            }
          }
          return hdfsApi;
        }
      

      Connection

        public synchronized void openConnection() throws HiveClientException, HiveAuthRequiredException {
          try {
            transport = isHttpTransportMode() ? createHttpTransport() : createBinaryTransport();
            transport.open();
            client = new TCLIService.Client(new TBinaryProtocol(transport));
          } catch (TTransportException e) {
            throw new HiveClientException("H020 Could not establish connection to "
                + host + ":" + port + ": " + e.toString(), e);
          } catch (SQLException e) {
            throw new HiveClientException(e.getMessage(), e);
          }
          LOG.info("Hive connection opened");
        }
      

      UserLocalConnection

        @Override
        protected synchronized Connection initialValue(ViewContext context) {
          ConnectionFactory hiveConnectionFactory = new ConnectionFactory(context, authCredentialsLocal.get(context));
          authCredentialsLocal.remove(context);  // we should not store credentials in memory,
                                                // password is erased after connection established
          return hiveConnectionFactory.create();
        }
      

      The problem with this approach is that views share the Jetty thread pool with the Ambari Server. When the Hive view is requested, several threads are spawned and each waits for a single connection to Hive. One thread enters the synchronized block and attempts to make the connections. All other threads are blocked, which means that Ambari's Jetty threads are blocked as well and cannot answer requests.

      Between opening connections to HDFS, Ambari, and Hive, these calls can easily take anywhere from several seconds to a minute to complete. During that time, no other requests can be fulfilled by Ambari on those threads. If several users are using Ambari, all available Jetty threads end up waiting for the sole Hive thread to complete its synchronized block.

      This essentially makes Ambari single-threaded.

      AMBARI-16131 is a workaround to alleviate this problem by denying access to the view if there are already too many threads being held by various views.
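
      The implementation of AMBARI-16131 is not shown in this issue. As a rough illustration of the general idea (rejecting a view request when too many are already in flight), a counting semaphore with a non-blocking acquire could look like the sketch below. The class name ViewRequestLimiter and its methods are hypothetical and do not come from the Ambari code base.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/**
 * Hypothetical sketch, not Ambari code: caps how many requests a view may
 * run concurrently and rejects the rest instead of letting them block.
 */
final class ViewRequestLimiter {
  private final Semaphore permits;

  ViewRequestLimiter(int maxConcurrentViewRequests) {
    this.permits = new Semaphore(maxConcurrentViewRequests);
  }

  /** Runs the request if a permit is free; otherwise fails immediately. */
  <T> T call(Supplier<T> request) {
    // tryAcquire() never waits, so a saturated view cannot tie up more threads.
    if (!permits.tryAcquire()) {
      throw new IllegalStateException("Too many view requests in flight");
    }
    try {
      return request.get();
    } finally {
      permits.release();
    }
  }
}
```

      A limiter like this only bounds the damage. The threads that do get a permit still wait on the synchronized block.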

      However, this problem also needs to be fixed in the Hive view. A new workflow based on callbacks, asynchronous returns, or polling while waiting for the connection would remove the need for these synchronized blocks.
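
      A minimal sketch of the asynchronous/polling idea, assuming Java 8 and a dedicated executor that is separate from the Jetty pool. It is not the change made in the attached patch. AsyncConnectionHolder and its methods are hypothetical names, and the Supplier stands in for whatever slow call opens the real connection.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/**
 * Hypothetical sketch, not Hive view code: opens a connection on a dedicated
 * executor so that request threads never wait on a monitor.
 */
final class AsyncConnectionHolder<C> {
  private final Supplier<C> connector;     // slow call, e.g. opening a connection
  private final ExecutorService executor;  // dedicated pool, not Jetty's
  private final AtomicReference<CompletableFuture<C>> pending = new AtomicReference<>();

  AsyncConnectionHolder(Supplier<C> connector, ExecutorService executor) {
    this.connector = connector;
    this.executor = executor;
  }

  /** Returns the in-flight or completed attempt, starting one if needed. Never blocks. */
  CompletableFuture<C> connect() {
    CompletableFuture<C> current = pending.get();
    if (current != null && !current.isCompletedExceptionally()) {
      return current;
    }
    CompletableFuture<C> created = new CompletableFuture<>();
    // Only one caller wins the compare-and-set and submits the slow work.
    if (pending.compareAndSet(current, created)) {
      executor.execute(() -> {
        try {
          created.complete(connector.get());
        } catch (Throwable t) {
          created.completeExceptionally(t);  // a later connect() will retry
        }
      });
      return created;
    }
    return pending.get();  // another thread started the attempt first
  }

  /** Polling variant: the connection if it is ready, otherwise empty. */
  Optional<C> poll() {
    CompletableFuture<C> attempt = connect();
    if (attempt.isDone() && !attempt.isCompletedExceptionally()) {
      return Optional.ofNullable(attempt.join());
    }
    return Optional.empty();
  }
}
```

      With a holder like this, a request handler could call poll() and return a "still connecting" response when it is empty, or attach a callback with connect().thenAccept(...), so that the request thread is released right away instead of blocking.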

      Here's an example of a thread dump showing the problem:

      This thread is stuck inside the synchronized block, trying to make a connection back to Ambari:

      "qtp-ambari-client-117" prio=10 tid=0x00007efbbc029800 nid=0x135e runnable [0x00007efb929e5000]
         java.lang.Thread.State: RUNNABLE
      	at java.net.SocketInputStream.socketRead0(Native Method)
      	at java.net.SocketInputStream.read(SocketInputStream.java:152)
      	at java.net.SocketInputStream.read(SocketInputStream.java:122)
      	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
      	at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
      	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
      	- locked <0x000000077769e870> (a java.io.BufferedInputStream)
      	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
      	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
      	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1325)
      	- locked <0x0000000777692ff8> (a sun.net.www.protocol.http.HttpURLConnection)
      	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
      	at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:209)
      	at org.apache.ambari.server.view.ViewAmbariStreamProvider.getInputStream(ViewAmbariStreamProvider.java:118)
      	at org.apache.ambari.server.view.ViewAmbariStreamProvider.readFrom(ViewAmbariStreamProvider.java:78)
      	at org.apache.ambari.view.utils.ambari.URLStreamProviderBasicAuth.readFrom(URLStreamProviderBasicAuth.java:65)
      	at org.apache.ambari.view.utils.ambari.AmbariApi.requestClusterAPI(AmbariApi.java:173)
      	at org.apache.ambari.view.utils.ambari.AmbariApi.requestClusterAPI(AmbariApi.java:142)
      	at org.apache.ambari.view.utils.ambari.AmbariApi.getHostsWithComponent(AmbariApi.java:99)
      	at org.apache.ambari.view.hive.client.ConnectionFactory.getHiveHost(ConnectionFactory.java:79)
      	at org.apache.ambari.view.hive.client.ConnectionFactory.create(ConnectionFactory.java:68)
      	at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:42)
      	- locked <0x0000000798772aa8> (a org.apache.ambari.view.hive.client.UserLocalConnection)
      

      However, that request can't be answered because all of the available Jetty threads are blocked, waiting for the above thread to finish its synchronized block:

      "qtp-ambari-client-118" prio=10 tid=0x00007efbbc02b000 nid=0x135f waiting for monitor entry [0x00007efb928e4000]
         java.lang.Thread.State: BLOCKED (on object monitor)
      	at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:39)
      	- waiting to lock <0x0000000798772aa8> (a org.apache.ambari.view.hive.client.UserLocalConnection)
      	at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:26)
      	at org.apache.ambari.view.utils.UserLocal.get(UserLocal.java:66)
      	at org.apache.ambari.view.hive.resources.browser.HiveBrowserService.databases(HiveBrowserService.java:87)
      
      ...
      
      "qtp-ambari-client-25" prio=10 tid=0x00007efc1b235800 nid=0xaab waiting for monitor entry [0x00007efbfb7f7000]
         java.lang.Thread.State: BLOCKED (on object monitor)
      	at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:39)
      	- waiting to lock <0x0000000798772aa8> (a org.apache.ambari.view.hive.client.UserLocalConnection)
      	at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:26)
      	at org.apache.ambari.view.utils.UserLocal.get(UserLocal.java:66)
      	at org.apache.ambari.view.hive.resources.browser.HiveBrowserService.databases(HiveBrowserService.java:87)
      
      

      Attachments

        1. AMBARI-16146_trunk.patch
          9 kB
          Nitiraj Singh Rathore

            People

              nitiraj.rathore Nitiraj Singh Rathore
              mahadev Mahadev Konar