Details
Type: Sub-task
Status: Done
Priority: Major
Resolution: Done
Description
On the `feature/METRON-2088-support-hdp-3.1` feature branch, the Enrichment topology is unable to load the GeoIP data from HDFS when Kerberos authentication is enabled. The topology logs the following error:
2019-10-03 18:23:18.545 o.a.h.i.Client Curator-TreeCache-0 [WARN] Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2019-10-03 18:23:18.552 o.a.m.e.a.m.MaxMindDbUtilities Curator-TreeCache-0 [ERROR] Unable to open new database file /apps/metron/geo/default/GeoLite2-City.tar.gz
java.io.IOException: DestHost:destPort metrong-1.openstacklocal:8020 , LocalHost:localPort metrong-7/172.22.74.121:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_112]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_112]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_112]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_112]
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1502) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1444) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1354) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at com.sun.proxy.$Proxy55.getBlockLocations(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:317) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at com.sun.proxy.$Proxy56.getBlockLocations(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:862) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:851) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:840) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1004) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:320) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:328) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:113) ~[stormjar.jar:?]
    at org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142) ~[stormjar.jar:?]
    at org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64) ~[stormjar.jar:?]
    at org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239) ~[stormjar.jar:?]
    at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148) ~[stormjar.jar:?]
    at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77) ~[stormjar.jar:?]
    at org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679) [stormjar.jar:?]
    at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92) [stormjar.jar:?]
    at org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [stormjar.jar:?]
    at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790) [stormjar.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:758) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:814) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1390) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    ... 46 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:615) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:411) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:801) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:797) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:797) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1390) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
    ... 46 more
2019-10-03 18:23:18.558 o.a.c.f.r.c.TreeCache Curator-TreeCache-0 [ERROR]
java.lang.IllegalStateException: Unable to update MaxMind database
    at org.apache.metron.enrichment.adapters.maxmind.MaxMindDbUtilities.handleDatabaseIOException(MaxMindDbUtilities.java:81) ~[stormjar.jar:?]
    at org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:127) ~[stormjar.jar:?]
    at org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142) ~[stormjar.jar:?]
    at org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64) ~[stormjar.jar:?]
    at org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239) ~[stormjar.jar:?]
    at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148) ~[stormjar.jar:?]
    at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77) ~[stormjar.jar:?]
    at org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120) ~[stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679) [stormjar.jar:?]
    at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92) [stormjar.jar:?]
    at org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [stormjar.jar:?]
    at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69) [stormjar.jar:?]
    at org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790) [stormjar.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
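As a diagnostic aside (an editor's sketch, not part of the original report): one way to confirm that the metron principal itself is healthy is to authenticate manually on a Storm Supervisor node and read the same HDFS path the topology fails on. The keytab path and GeoIP path below come from this issue; the principal 'metron@EXAMPLE.COM' is an assumed placeholder for your realm.

    # Hypothetical manual check, run on a Storm Supervisor node.
    sudo -u metron klist    # inspect the current ticket cache, if any
    sudo -u metron kinit -kt /etc/security/keytabs/metron.headless.keytab metron@EXAMPLE.COM
    sudo -u metron hdfs dfs -ls /apps/metron/geo/default/GeoLite2-City.tar.gz

If the manual read succeeds while the topology still fails, the problem is specific to how the Storm worker process obtains its credentials, which is consistent with the comments below.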
Environment
Cluster details below.
172.22.75.252 metronh-5.openstacklocal metronh-5 metronh-5.openstacklocal.
172.22.75.250 metronh-4.openstacklocal metronh-4 metronh-4.openstacklocal.
172.22.75.248 metronh-3.openstacklocal metronh-3 metronh-3.openstacklocal.
172.22.75.246 metronh-2.openstacklocal metronh-2 metronh-2.openstacklocal.
172.22.75.244 metronh-1.openstacklocal metronh-1 metronh-1.openstacklocal.
Ambari node - metronh-1.openstacklocal
Metron node - metronh-5.openstacklocal
PEM file for SSH - attached
Known Issue Description
None
Testing Procedure
None
Attachments
Activity
Comments
Nicholas Allen - October 10, 2019, 12:18 PM (edited)
After more testing, the tgt_renew script workaround is not going to work. The workaround looks like the following.
Kerberos authentication against HDFS from Metron's Storm topologies can fail: the Storm worker is unable to present a valid Kerberos ticket to authenticate against HDFS. This impacts the Enrichment and Batch Indexing topologies, which each interact with HDFS. To mitigate this problem, before starting the Metron topologies in a secured cluster using Kerberos authentication, an additional installation step is required: a periodic job must be scheduled to obtain and cache a Kerberos ticket. A sketch of such a job follows this list.
- The job should be scheduled on each node hosting a Storm Supervisor.
- The job should run as the user 'metron'.
- The job should kinit using the Metron keytab, typically located at /etc/security/keytabs/metron.headless.keytab.
- The job should run at least as frequently as the ticket lifetime, so that a valid ticket is always cached and available to the topologies.
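For reference (an editor's illustration of the workaround described above, which ultimately did not work), such a job amounts to a crontab entry for the 'metron' user on each Storm Supervisor node. The principal 'metron@EXAMPLE.COM' and the 12-hour interval are assumptions; the interval must not exceed the ticket lifetime configured in your realm.

    # Hypothetical crontab entry (installed via 'crontab -e' as the 'metron' user).
    # Re-initializes the ticket cache every 12 hours from the Metron keytab.
    # The principal 'metron@EXAMPLE.COM' is an assumed placeholder.
    0 */12 * * * /usr/bin/kinit -kt /etc/security/keytabs/metron.headless.keytab metron@EXAMPLE.COM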
Nicholas Allen - October 10, 2019, 12:42 PM
It is especially interesting that authentication only seems to fail against HDFS, not against other systems such as Kafka. We have only seen failures in Enrichment and Batch Indexing against HDFS. In the case of Batch Indexing, the topology was able to consume messages from Kafka but unable to write them to HDFS.
Nicholas Allen - October 10, 2019, 1:09 PM (edited)
I am attaching a worker log where the client JAAS is set to debug and the worker failed to authenticate with HDFS. It appears to show that the worker used the metron ticket. See worker.log, attached.
Nicholas Allen - October 10, 2019, 4:44 PM
I was able to capture the error with DEBUG logging enabled in Storm, which is surprisingly difficult to do (long story). See the attached worker.debug.log.
Nicholas Allen - October 10, 2019, 5:04 PM (edited)
Based on these logs, it appears that the client is attempting simple (unauthenticated) authentication even though the server is kerberized. That is based on messages such as "PrivilegedActionException as:metron (auth:SIMPLE)" and "client isn't using kerberos".
2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] Get token info proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB info:@org.apache.hadoop.security.token.TokenInfo(value=class org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSelector)
2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] tokens aren't supported for this protocol or user doesn't have one
2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] client isn't using kerberos
2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException as:metron (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedAction as:metron (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721)
2019-10-10 20:30:26.085 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3 3] [WARN] Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2019-10-10 20:30:26.086 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException as:metron (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2019-10-10 20:30:26.086 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] closing ipc connection to nicksolr-1.openstacklocal/172.22.76.204:8020: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
These messages can be matched up to the source code here:
https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java
A sketch of the login behavior these messages point at follows the links above.
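For context (an editor's sketch, not Metron's actual code): 'auth:SIMPLE' in those messages indicates that the worker's UserGroupInformation never completed a Kerberos login, so the SASL client has neither a delegation token nor a TGT to offer. A minimal sketch of the standard Hadoop client login sequence, assuming the cluster's core-site.xml/hdfs-site.xml are on the classpath and using an assumed principal name:

    // Sketch only: illustrates the standard UserGroupInformation login path.
    // The principal 'metron@EXAMPLE.COM' is an assumed placeholder.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class KerberosLoginSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Normally supplied by core-site.xml; without it, UGI falls back to
        // SIMPLE auth, which matches the 'auth:SIMPLE' seen in the logs above.
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Acquire a TGT from the keytab. After this call,
        // UserGroupInformation.getLoginUser() reports (auth:KERBEROS).
        UserGroupInformation.loginUserFromKeytab(
            "metron@EXAMPLE.COM", "/etc/security/keytabs/metron.headless.keytab");
        System.out.println(UserGroupInformation.getLoginUser());

        // The same kind of read the Enrichment topology attempts.
        try (FileSystem fs = FileSystem.get(conf)) {
          fs.open(new Path("/apps/metron/geo/default/GeoLite2-City.tar.gz")).close();
        }
      }
    }

If a worker process never executes this login path (or executes it before the Kerberos configuration is in place), every subsequent HDFS RPC fails exactly as shown in the debug log above.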