AMBARI-13295: ACCUMULO_TRACER START failed after enabling Kerberos

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.2
    • Fix Version/s: 2.2.0
    • Component/s: ambari-server
    • Labels: None

    Description

      After enabling Kerberos, ACCUMULO_TRACER START failed at the "Start and Test Services" step.

      "stderr" : "Python script has been killed due to timeout after waiting 180 secs",
      
      "stdout" : "2015-09-25 14:42:53,963 - Group['custom-spark'] {}\n2015-09-25 14:42:53,964 - Group['hadoop'] {}\n2015-09-25 14:42:53,965 - Group['custom-users'] {}\n2015-09-25 14:42:53,965 - Group['custom-knox-group'] {}\n2015-09-25 14:42:53,965 - User['custom-sqoop'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,966 - User['custom-knox'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,967 - User['custom-hdfs'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,968 - User['custom-oozie'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 14:42:53,969 - User['custom-smoke'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 14:42:53,970 - User['custom-hbase'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,971 - User['custom-tez'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 14:42:53,972 - User['custom-hive'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,973 - User['custom-mr'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,973 - User['custom-accumulo'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,974 - User['custom-hcat'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,975 - User['custom-ams'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,976 - User['custom-yarn'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,977 - User['custom-falcon'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 14:42:53,977 - User['custom-spark'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,978 - User['custom-atlas'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,979 - User['custom-flume'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,980 - User['custom-kafka'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,981 - User['custom-zookeeper'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,982 - User['custom-mahout'] {'gid': 
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,982 - User['custom-storm'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,983 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}\n2015-09-25 14:42:53,985 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-smoke /tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke'] {'not_if': '(test $(id -u custom-smoke) -gt 1000) || (false)'}\n2015-09-25 14:42:53,991 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-smoke /tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke'] due to not_if\n2015-09-25 14:42:53,991 - Directory['/tmp/hbase-hbase'] {'owner': 'custom-hbase', 'recursive': True, 'mode': 0775, 'cd_access': 'a'}\n2015-09-25 14:42:53,992 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}\n2015-09-25 14:42:53,993 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-hbase /home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u custom-hbase) -gt 1000) || (false)'}\n2015-09-25 14:42:53,999 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-hbase /home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase'] due to not_if\n2015-09-25 14:42:54,000 - Group['custom-hdfs'] {'ignore_failures': False}\n2015-09-25 14:42:54,000 - User['custom-hdfs'] {'ignore_failures': False, 'groups': [u'hadoop', u'custom-hdfs']}\n2015-09-25 14:42:54,001 - Directory['/etc/hadoop'] {'mode': 0755}\n2015-09-25 14:42:54,019 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}\n2015-09-25 14:42:54,019 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 
'custom-hdfs', 'group': 'hadoop', 'mode': 0777}\n2015-09-25 14:42:54,032 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}\n2015-09-25 14:42:54,039 - Skipping Execute[('setenforce', '0')] due to not_if\n2015-09-25 14:42:54,040 - Directory['/grid/0/log/hadoop'] {'owner': 'root', 'mode': 0775, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}\n2015-09-25 14:42:54,043 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True, 'cd_access': 'a'}\n2015-09-25 14:42:54,043 - Directory['/tmp/hadoop-custom-hdfs'] {'owner': 'custom-hdfs', 'recursive': True, 'cd_access': 'a'}\n2015-09-25 14:42:54,048 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'root'}\n2015-09-25 14:42:54,051 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'root'}\n2015-09-25 14:42:54,051 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 'owner': 'custom-hdfs', 'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,074 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'custom-hdfs'}\n2015-09-25 14:42:54,075 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}\n2015-09-25 14:42:54,076 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'custom-hdfs', 'group': 'hadoop'}\n2015-09-25 14:42:54,083 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'custom-hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop'}\n2015-09-25 14:42:54,089 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 
'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}\n2015-09-25 14:42:54,275 - Directory['/usr/hdp/current/accumulo-tracer/conf'] {'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True, 'mode': 0755}\n2015-09-25 14:42:54,277 - Directory['/usr/hdp/current/accumulo-tracer/conf/server'] {'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True, 'mode': 0700}\n2015-09-25 14:42:54,278 - XmlConfig['accumulo-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/current/accumulo-tracer/conf/server', 'mode': 0600, 'configuration_attributes': {}, 'owner': 'custom-accumulo', 'configurations': ...}\n2015-09-25 14:42:54,292 - Generating config: /usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml\n2015-09-25 14:42:54,293 - File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml'] {'owner': 'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0600, 'encoding': 'UTF-8'}\n2015-09-25 14:42:54,317 - Directory['/var/run/accumulo'] {'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True}\n2015-09-25 14:42:54,318 - Directory['/grid/0/log/accumulo'] {'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True}\n2015-09-25 14:42:54,323 - File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-env.sh'] {'content': InlineTemplate(...), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,324 - PropertiesFile['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] {'owner': 'custom-accumulo', 'group': 'hadoop', 'properties': {'instance.zookeeper.host': u'ambari-ooziehive-r1-2.novalocal:2181,ambari-ooziehive-r1-3.novalocal:2181,ambari-ooziehive-r1-5.novalocal:2181', 'instance.name': u'hdp-accumulo-instance', 'instance.rpc.sasl.enabled': True, 'instance.zookeeper.timeout': u'30s'}}\n2015-09-25 14:42:54,329 - Generating properties file: /usr/hdp/current/accumulo-tracer/conf/server/client.conf\n2015-09-25 14:42:54,329 - 
File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] {'owner': 'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,332 - Writing File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] because contents don't match\n2015-09-25 14:42:54,333 - File['/usr/hdp/current/accumulo-tracer/conf/server/log4j.properties'] {'content': ..., 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,333 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,337 - File['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml'] {'content': Template('auditLog.xml.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,337 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,341 - File['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml'] {'content': Template('generic_logger.xml.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,342 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,344 - File['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml'] {'content': Template('monitor_logger.xml.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,345 - File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-metrics.xml'] {'content': StaticFile('accumulo-metrics.xml'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,346 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/tracers'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 
14:42:54,348 - File['/usr/hdp/current/accumulo-tracer/conf/server/tracers'] {'content': Template('tracers.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,349 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/gc'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,351 - File['/usr/hdp/current/accumulo-tracer/conf/server/gc'] {'content': Template('gc.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,352 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,354 - File['/usr/hdp/current/accumulo-tracer/conf/server/monitor'] {'content': Template('monitor.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,355 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/slaves'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,357 - File['/usr/hdp/current/accumulo-tracer/conf/server/slaves'] {'content': Template('slaves.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,357 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/masters'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,359 - File['/usr/hdp/current/accumulo-tracer/conf/server/masters'] {'content': Template('masters.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,360 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties'] {'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,368 - File['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties'] {'content': Template('hadoop-metrics2-accumulo.properties.j2'), 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 
None}\n2015-09-25 14:42:54,369 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/accumulo.headless.keytab custom-accumulo@EXAMPLE.COM; ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server /usr/hdp/current/accumulo-client/bin/accumulo init --reset-security --user custom-accumulo@EXAMPLE.COM --password NA >/grid/0/log/accumulo/accumulo-reset.out 2>/grid/0/log/accumulo/accumulo-reset.err'] {'not_if': 'ambari-sudo.sh su custom-accumulo -l -s /bin/bash -c \\'/usr/bin/kinit -kt /etc/security/keytabs/accumulo.headless.keytab custom-accumulo@EXAMPLE.COM; ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server /usr/hdp/current/accumulo-client/bin/accumulo shell -e \"userpermissions -u custom-accumulo@EXAMPLE.COM\" | grep System.CREATE_TABLE\\'', 'user': 'custom-accumulo'}",
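      Because the script was killed by the 180-second timeout, the last entry written to stdout is the resource that was still running when the timeout fired (here, the Execute that runs kinit followed by accumulo init --reset-security). A minimal sketch of pulling that entry out of a captured stdout string is shown below. It assumes the JSON value has already been decoded so that entries are separated by real newlines; the helper name last_entry and the shortened sample are illustrative only and are not part of Ambari.

```python
import re

# One resource_management log entry looks like "<timestamp> - <message>".
ENTRY = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*)")


def last_entry(stdout):
    """Return (timestamp, message) for the final log entry, or None if there is none."""
    entries = [m.groups() for m in map(ENTRY.match, stdout.splitlines()) if m]
    return entries[-1] if entries else None


# Two entries taken from the stdout above; the Execute command is cut short
# here for brevity (the "; ..." is not part of the real command).
sample = (
    "2015-09-25 14:42:54,317 - Directory['/var/run/accumulo'] "
    "{'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True}\n"
    "2015-09-25 14:42:54,369 - Execute['/usr/bin/kinit -kt "
    "/etc/security/keytabs/accumulo.headless.keytab custom-accumulo@EXAMPLE.COM; ...']"
)

timestamp, message = last_entry(sample)
print(timestamp, message[:40])
```

      Note that the final entry carries a not_if guard, so this alone does not tell whether the guard command or the guarded command was the one that hung.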
      

      The tserver log contains the following exceptions:

      2015-09-25 14:29:38,821 [tserver.TabletServer] INFO : Started replication service on ambari-ooziehive-r1-2.novalocal:10002
      2015-09-25 14:29:55,489 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
      java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
      	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
      	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
      	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:360)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
      	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
      	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.thrift.transport.TTransportException
      	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
      	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
      	at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
      	at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
      	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
      	at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
      	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
      	... 11 more
      2015-09-25 14:30:01,812 [tserver.TabletServer] INFO : Loading tablet !0<;~
      2015-09-25 14:30:01,894 [tserver.TabletServer] INFO : ambari-ooziehive-r1-2.novalocal:9997: got assignment from master: !0<;~
      2015-09-25 14:30:02,833 [util.MetadataTableUtil] INFO : Scanning logging entries for !0<;~
      2015-09-25 14:30:02,862 [util.MetadataTableUtil] INFO : Scanning metadata for logs used for tablet !0<;~
      2015-09-25 14:30:02,924 [util.MetadataTableUtil] INFO : Returning logs [] for extent !0<;~
      2015-09-25 14:30:34,637 [server.TThreadPoolServer] ERROR: Error occurred during processing of message.
      java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
      	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
      	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
      	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:360)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
      	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
      	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
      	at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
      	at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
      	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
      	at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
      	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
      	... 11 more
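      The two stack traces above differ only in their root cause: the first is a bare TTransportException, the second reports "GSS initiate failed". A rough sketch for tallying root causes across a whole tserver log is shown below. The function name root_causes is hypothetical, and it only looks at "Caused by:" lines, so it is a triage aid rather than a full log parser.

```python
from collections import Counter

PREFIX = "Caused by: "


def root_causes(log_text):
    """Count each distinct 'Caused by:' message found in a Java log."""
    causes = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            causes[line[len(PREFIX):]] += 1
    return causes


# The two "Caused by" lines from the tserver log above.
sample_log = """
Caused by: org.apache.thrift.transport.TTransportException
\tat org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
\tat org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
"""

for cause, count in root_causes(sample_log).most_common():
    print(count, cause)
```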
      

      Live cluster on which the failure occurred (available for another 48 hours):
      172.22.90.201 ambari-ooziehive-r1-5.novalocal ambari-ooziehive-r1-5
      172.22.90.200 ambari-ooziehive-r1-2.novalocal ambari-ooziehive-r1-2
      172.22.90.198 ambari-ooziehive-r1-3.novalocal ambari-ooziehive-r1-3
      172.22.90.197 ambari-ooziehive-r1-4.novalocal ambari-ooziehive-r1-4
      172.22.90.199 ambari-ooziehive-r1-1.novalocal ambari-ooziehive-r1-1

      Attachments

        1. AMBARI-13295.patch (6 kB, Dmitry Lysnichenko)

            People

              Assignee: Dmitry Lysnichenko (dmitriusan)
              Reporter: Dmitry Lysnichenko (dmitriusan)