Ambari / AMBARI-12361

RU: Secure Upgrade Fails On Restarting NameNode Because kinit Is Not Called Before HDFS Command

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.1
    • Component/s: ambari-server
    • Labels: None

    Description

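      During a rolling upgrade (RU) from HDP 2.2 to 2.3 on a Kerberos-secured cluster, the NameNode fails to restart. The restart runs `hdfs haadmin -getServiceState nn1` as the hdfs user without calling kinit first, so no Kerberos TGT is available and the command exits with code 255: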

      2015-07-09 08:00:21,274 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:21,314 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:21,354 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:21,396 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.0.0-2545 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
      2015-07-09 08:00:21,437 - checked_call returned (0, '/usr/hdp/2.3.0.0-2545/hadoop/conf -> /etc/hadoop/2.3.0.0-2545/0')
      2015-07-09 08:00:21,476 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:21,659 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.0.0-2545 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
      2015-07-09 08:00:21,698 - checked_call returned (0, '/usr/hdp/2.3.0.0-2545/hadoop/conf -> /etc/hadoop/2.3.0.0-2545/0')
      2015-07-09 08:00:21,702 - Group['hadoop'] {'ignore_failures': False}
      2015-07-09 08:00:21,704 - Group['users'] {'ignore_failures': False}
      2015-07-09 08:00:21,704 - Group['knox'] {'ignore_failures': False}
      2015-07-09 08:00:21,705 - User['hive'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,706 - User['oozie'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['users']}
      2015-07-09 08:00:21,707 - User['ambari-qa'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['users']}
      2015-07-09 08:00:21,708 - User['flume'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,709 - User['hdfs'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,710 - User['knox'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,711 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,712 - User['hbase'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,713 - User['tez'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['users']}
      2015-07-09 08:00:21,714 - User['zookeeper'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,715 - User['falcon'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['users']}
      2015-07-09 08:00:21,716 - User['sqoop'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,717 - User['yarn'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,718 - User['hcat'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,719 - User['ams'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': ['hadoop']}
      2015-07-09 08:00:21,720 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
      2015-07-09 08:00:21,723 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
      2015-07-09 08:00:21,733 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
      2015-07-09 08:00:21,734 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'recursive': True, 'mode': 0775, 'cd_access': 'a'}
      2015-07-09 08:00:21,735 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
      2015-07-09 08:00:21,736 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
      2015-07-09 08:00:21,746 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
      2015-07-09 08:00:21,746 - Group['hdfs'] {'ignore_failures': False}
      2015-07-09 08:00:21,747 - User['hdfs'] {'ignore_failures': False, 'groups': ['hadoop', 'hdfs']}
      2015-07-09 08:00:21,748 - Directory['/etc/hadoop'] {'mode': 0755}
      2015-07-09 08:00:21,769 - File['/usr/hdp/2.3.0.0-2545/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}
      2015-07-09 08:00:21,786 - Execute['('setenforce', '0')'] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
      2015-07-09 08:00:21,824 - Directory['/grid/0/log/hadoop'] {'owner': 'root', 'mode': 0775, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
      2015-07-09 08:00:21,826 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True, 'cd_access': 'a'}
      2015-07-09 08:00:21,826 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'recursive': True, 'cd_access': 'a'}
      2015-07-09 08:00:21,833 - File['/usr/hdp/2.3.0.0-2545/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'root'}
      2015-07-09 08:00:21,835 - File['/usr/hdp/2.3.0.0-2545/hadoop/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'root'}
      2015-07-09 08:00:21,836 - File['/usr/hdp/2.3.0.0-2545/hadoop/conf/log4j.properties'] {'content': '...', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
      2015-07-09 08:00:21,850 - File['/usr/hdp/2.3.0.0-2545/hadoop/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
      2015-07-09 08:00:21,850 - File['/usr/hdp/2.3.0.0-2545/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
      2015-07-09 08:00:21,858 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'group': 'hadoop'}
      2015-07-09 08:00:21,859 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'mode': 0755}
      2015-07-09 08:00:22,135 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.0.0-2545 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
      2015-07-09 08:00:22,177 - checked_call returned (0, '/usr/hdp/2.3.0.0-2545/hadoop/conf -> /etc/hadoop/2.3.0.0-2545/0')
      2015-07-09 08:00:22,219 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:22,261 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:22,305 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:22,344 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:22,387 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:22,428 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.0.0-2545 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
      2015-07-09 08:00:22,471 - checked_call returned (0, '/usr/hdp/2.3.0.0-2545/hadoop/conf -> /etc/hadoop/2.3.0.0-2545/0')
      2015-07-09 08:00:22,515 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:22,577 - hadoop-hdfs-namenode is currently at version 2.2.0.0-2041
      2015-07-09 08:00:22,595 - call['hdfs haadmin -getServiceState nn1'] {'logoutput': True, 'user': 'hdfs'}
      15/07/09 08:00:25 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
      Operation failed: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "os-r6-qljsls-c2dalsec-13.openstacklocal/172.22.106.178"; destination host is: "os-r6-qljsls-c2dalsec-13.openstacklocal":8020; 
      2015-07-09 08:00:25,350 - call returned (255, '15/07/09 08:00:25 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\nOperation failed: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "os-r6-qljsls-c2dalsec-13.openstacklocal/172.22.106.178"; destination host is: "os-r6-qljsls-c2dalsec-13.openstacklocal":8020; ')
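
The failure above comes down to command ordering: on a secure cluster the hdfs user needs a Kerberos ticket before `hdfs haadmin` can authenticate. The following is a minimal sketch of that ordering, not the actual Ambari patch. The function name `build_haadmin_command`, its parameters, and the keytab path and principal in the usage note are assumptions made for illustration; only `kinit -kt <keytab> <principal>` and `hdfs haadmin -getServiceState <id>` are real commands.

```python
import shlex


def build_haadmin_command(namenode_id, security_enabled,
                          keytab=None, principal=None,
                          kinit_path="/usr/bin/kinit"):
    """Return the shell command that queries a NameNode's HA state.

    Hypothetical helper: when security is enabled, the command is
    prefixed with a kinit call so a Kerberos TGT exists before the
    HDFS client connects.
    """
    haadmin = "hdfs haadmin -getServiceState {0}".format(
        shlex.quote(namenode_id))

    if not security_enabled:
        # Non-Kerberized cluster: no ticket is required.
        return haadmin

    if not keytab or not principal:
        raise ValueError(
            "keytab and principal are required when security is enabled")

    # Obtain a ticket from the keytab, then run haadmin only if kinit succeeded.
    kinit = "{0} -kt {1} {2}".format(
        shlex.quote(kinit_path), shlex.quote(keytab), shlex.quote(principal))
    return "{0} && {1}".format(kinit, haadmin)
```

With an assumed keytab of `/etc/security/keytabs/hdfs.headless.keytab` and principal `hdfs@EXAMPLE.COM`, the helper returns a single string that would be run as the hdfs user, so that the ticket lands in the credential cache of the same user that runs `hdfs haadmin`.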
      

            People

              Assignee: Jonathan Hurley (jonathanhurley)
              Reporter: Jonathan Hurley (jonathanhurley)