Hive / HIVE-14631

HiveServer2 regularly fails to connect to metastore

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.1, 2.0.0, 2.1.0
    • Fix Version/s: None
    • Component/s: HiveServer2
    • Labels:
      None
    • Environment:

      Hive 2.1.0, Hue 3.10.0, Hadoop 2.7.2, Tez 0.8.3

      Description

      I have a cluster secured with Kerberos, and Hive is configured to use Tez by default. Everything works well through hive-cli and beeline; however, I see strange behavior through Hue.
      There can be many client connections (up to 600), and after a day client connections start to fail, although not every connection attempt fails.

      When a connection fails, HiveServer2 logs the following:

      Aug  3 09:28:04 hiveserver2.bigdata.fr Executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112): INSERT INTO TABLE shfs3453.camille_test VALUES ('coucou')
      Aug  3 09:28:04 hiveserver2.bigdata.fr Query ID = hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112
      Aug  3 09:28:04 hiveserver2.bigdata.fr Total jobs = 1
      Aug  3 09:28:04 hiveserver2.bigdata.fr Launching Job 1 out of 1
      Aug  3 09:28:04 hiveserver2.bigdata.fr Starting task [Stage-1:MAPRED] in parallel
      Aug  3 09:28:04 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
      Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
      Aug  3 09:28:04 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
      Aug  3 09:28:05 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
      Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
      Aug  3 09:28:05 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
      Aug  3 09:28:06 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
      Aug  3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
      Aug  3 09:28:06 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
      Aug  3 09:28:08 hiveserver2.bigdata.fr FAILED: Execution Error, return code -1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
      Aug  3 09:28:08 hiveserver2.bigdata.fr Completed executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112); Time taken: 4.002 seconds
      

      At the same time, the Metastore logs the following:

      Aug  3 09:28:03 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
      Aug  3 09:28:03 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
      Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
      Aug  3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
      Aug  3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
      Aug  3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
      Aug  3 09:28:04 metastore01.bigdata.fr SASL negotiation failure
      Aug  3 09:28:04 metastore01.bigdata.fr Error occurred during processing of message.
      Aug  3 09:28:05 metastore01.bigdata.fr SASL negotiation failure
      Aug  3 09:28:05 metastore01.bigdata.fr Error occurred during processing of message.
      Aug  3 09:28:06 metastore01.bigdata.fr SASL negotiation failure
      Aug  3 09:28:06 metastore01.bigdata.fr Error occurred during processing of message.
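
      To check that the HiveServer2 connection failures line up with the SASL failures on the Metastore, the two logs can be compared by timestamp. The following is only a minimal sketch, assuming syslog-style lines like the ones above; `count_by_second` and the sample constants are hypothetical names for illustration and are not part of Hive.

```python
import re
from collections import Counter

# Sample lines copied from the two logs above
# (syslog format: "Mon DD HH:MM:SS host message").
HS2_LOG = """\
Aug  3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug  3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug  3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
"""
METASTORE_LOG = """\
Aug  3 09:28:04 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:05 metastore01.bigdata.fr SASL negotiation failure
Aug  3 09:28:06 metastore01.bigdata.fr SASL negotiation failure
"""

# Groups: timestamp, host, message.
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) (.*)$")


def count_by_second(log_text, needle):
    """Count lines whose message contains `needle`, keyed by timestamp."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and needle in m.group(3):
            counts[m.group(1)] += 1
    return counts


hs2_failures = count_by_second(HS2_LOG, "Failed to connect to the MetaStore Server")
sasl_failures = count_by_second(METASTORE_LOG, "SASL negotiation failure")

# Timestamps at which both sides reported a failure.
matched = sorted(set(hs2_failures) & set(sasl_failures))
for ts in matched:
    print(ts, hs2_failures[ts], sasl_failures[ts])
```

      On the excerpts above, each failed connection attempt on HiveServer2 has a SASL negotiation failure on the Metastore in the same second. To use the sketch on real data, replace the sample constants with the contents of the full log files.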
      

      To restore connectivity, I have to restart HiveServer2.

      Note: I also created a JIRA for Hue: https://issues.cloudera.org/browse/HUE-4748

            People

            • Assignee:
              Unassigned
              Reporter:
              BigDataOrange Alexandre Linte