Metron (Retired)
METRON-2297 (sub-task of METRON-2088 Support HDP 3.1.0)

Enrichment Topology Unable to Load Geo IP Data from HDFS

Details

    • Type: Sub-task
    • Status: Done
    • Priority: Major
    • Resolution: Done

    Description

      On the `feature/METRON-2088-support-hdp-3.1` feature branch, the Enrichment topology is unable to load the GeoIP data from HDFS when using Kerberos authentication. The Enrichment topology logs the following error.

      2019-10-03 18:23:18.545 o.a.h.i.Client Curator-TreeCache-0 [WARN] Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      2019-10-03 18:23:18.552 o.a.m.e.a.m.MaxMindDbUtilities Curator-TreeCache-0 [ERROR] Unable to open new database file /apps/metron/geo/default/GeoLite2-City.tar.gz
      java.io.IOException: DestHost:destPort metrong-1.openstacklocal:8020 , LocalHost:localPort metrong-7/172.22.74.121:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_112]
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_112]
      	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_112]
      	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_112]
      	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1502) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1444) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1354) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at com.sun.proxy.$Proxy55.getBlockLocations(Unknown Source) ~[?:?]
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:317) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
      	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at com.sun.proxy.$Proxy56.getBlockLocations(Unknown Source) ~[?:?]
      	at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:862) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:851) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:840) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1004) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:320) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:328) ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:113) ~[stormjar.jar:?]
      	at org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142) ~[stormjar.jar:?]
      	at org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64) ~[stormjar.jar:?]
      	at org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239) ~[stormjar.jar:?]
      	at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148) ~[stormjar.jar:?]
      	at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77) ~[stormjar.jar:?]
      	at org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679) [stormjar.jar:?]
      	at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92) [stormjar.jar:?]
      	at org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [stormjar.jar:?]
      	at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790) [stormjar.jar:?]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
      	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
      Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:758) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
      	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:814) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1390) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	... 46 more
      Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      	at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:615) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:411) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:801) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:797) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
      	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:797) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1390) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
      	... 46 more
      2019-10-03 18:23:18.558 o.a.c.f.r.c.TreeCache Curator-TreeCache-0 [ERROR] 
      java.lang.IllegalStateException: Unable to update MaxMind database
      	at org.apache.metron.enrichment.adapters.maxmind.MaxMindDbUtilities.handleDatabaseIOException(MaxMindDbUtilities.java:81) ~[stormjar.jar:?]
      	at org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:127) ~[stormjar.jar:?]
      	at org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142) ~[stormjar.jar:?]
      	at org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64) ~[stormjar.jar:?]
      	at org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239) ~[stormjar.jar:?]
      	at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148) ~[stormjar.jar:?]
      	at org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77) ~[stormjar.jar:?]
      	at org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120) ~[stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679) [stormjar.jar:?]
      	at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92) [stormjar.jar:?]
      	at org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [stormjar.jar:?]
      	at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69) [stormjar.jar:?]
      	at org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790) [stormjar.jar:?]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
      	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]

    Environment

      Cluster details below.

      172.22.75.252	metronh-5.openstacklocal	metronh-5	metronh-5.openstacklocal.
      172.22.75.250	metronh-4.openstacklocal	metronh-4	metronh-4.openstacklocal.
      172.22.75.248	metronh-3.openstacklocal	metronh-3	metronh-3.openstacklocal.
      172.22.75.246	metronh-2.openstacklocal	metronh-2	metronh-2.openstacklocal.
      172.22.75.244	metronh-1.openstacklocal	metronh-1	metronh-1.openstacklocal.

      Ambari node - metronh-1.openstacklocal
      Metron node - metronh-5.openstacklocal
      PEM file for SSH - attached

    Known Issue Description

      None

    Testing Procedure

      None

    Comments

      Nicholas Allen, October 10, 2019, 12:18 PM (edited)

      After more testing, the tgt_renew script workaround is not going to work. The workaround looks like the following.

      Kerberos authentication against HDFS from Metron's Storm topologies can fail. The Storm worker is unable to present a valid Kerberos ticket to authenticate against HDFS. This impacts the Enrichment and Batch Indexing topologies, which each interact with HDFS.
      
      To mitigate this problem, an additional installation step is required before starting the Metron topologies in a secured cluster using Kerberos authentication. A periodic job should be scheduled to obtain and cache a Kerberos ticket.

      • The job should be scheduled on each node hosting a Storm Supervisor.
      • The job should run as the user ‘metron’.
      • The job should kinit using the Metron keytab, often located at /etc/security/keytabs/metron.headless.keytab.
      • The job should be scheduled to run at least as frequently as the ticket lifetime to ensure that a ticket is always cached and available for the topologies.

      Nicholas Allen, October 10, 2019, 12:42 PM

      It is especially interesting that authentication only seems to fail against HDFS, not against other systems like Kafka. We’ve only seen failures in Enrichment and Batch Indexing against HDFS. In the case of Batch Indexing, it was able to consume messages from Kafka, but unable to write them to HDFS.

      Nicholas Allen, October 10, 2019, 1:09 PM (edited)

      I am attaching a worker log where the client JAAS is set to debug and the worker failed to authenticate with HDFS. It seems to show that the worker used the metron ticket. See worker.log attached.

      Nicholas Allen, October 10, 2019, 4:44 PM

      I was able to capture the error with DEBUG logs on in Storm, which is surprisingly difficult to do (long story). See attached worker.debug.log.

      Nicholas Allen, October 10, 2019, 5:04 PM (edited)

      Based on these logs, it appears that the client is trying to do simple (non-authenticated) authentication when, of course, the server is kerberized. That is based on messages like “PrivilegedActionException as:metron (auth:SIMPLE)” and “client isn't using kerberos”.

      2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] Get token info proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB info:@org.apache.hadoop.security.token.TokenInfo(value=class org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSelector)
      2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] tokens aren't supported for this protocol or user doesn't have one
      2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] client isn't using kerberos
      2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException as:metron (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedAction as:metron (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721)
      2019-10-10 20:30:26.085 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3 3] [WARN] Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      2019-10-10 20:30:26.086 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException as:metron (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      2019-10-10 20:30:26.086 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] closing ipc connection to nicksolr-1.openstacklocal/172.22.76.204:8020: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
      java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]

      These messages can be matched up to the source code here.

      https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
      https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
      https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java
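
The check described in this comment (looking for UserGroupInformation failures that report auth:SIMPLE) could be automated against a worker log. The following is a minimal sketch in Python, not part of Metron or Hadoop. The names `find_auth_failures`, `simple_auth_failures`, and `AuthFailure`, and the regular expression itself, are illustrative assumptions derived only from the DEBUG lines quoted above.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed pattern, modelled on the quoted UserGroupInformation DEBUG lines, e.g.
#   ... PrivilegedActionException as:metron (auth:SIMPLE) cause:...AccessControlException: ...
UGI_FAILURE = re.compile(
    r"PrivilegedActionException as:(?P<user>\S+) \(auth:(?P<method>[A-Z_]+)\)"
    r".*AccessControlException"
)


class AuthFailure(NamedTuple):
    """One authentication failure found in a log (hypothetical helper type)."""
    line_number: int  # 1-based position of the line in the input
    user: str         # principal short name reported after "as:"
    method: str       # authentication method reported after "auth:"


def find_auth_failures(lines: Iterable[str]) -> List[AuthFailure]:
    """Return every UGI failure line that mentions an AccessControlException."""
    failures = []
    for number, line in enumerate(lines, start=1):
        match = UGI_FAILURE.search(line)
        if match:
            failures.append(AuthFailure(number, match["user"], match["method"]))
    return failures


def simple_auth_failures(lines: Iterable[str]) -> List[AuthFailure]:
    """Keep only failures where the client attempted SIMPLE authentication."""
    return [f for f in find_auth_failures(lines) if f.method == "SIMPLE"]


if __name__ == "__main__":
    # Two lines copied from the log excerpt in this comment.
    sample = [
        "2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient "
        "Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] client isn't using kerberos",
        "2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation "
        "Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException "
        "as:metron (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: "
        "Client cannot authenticate via:[TOKEN, KERBEROS]",
    ]
    for failure in simple_auth_failures(sample):
        print(failure)
```

Any hit from `simple_auth_failures` would be consistent with the observation above that the worker is attempting simple authentication against a kerberized NameNode. It only identifies the symptom and does not explain why the Kerberos login was not used.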
      

            People

              Assignee: Nick Allen (nickwallen)
              Reporter: Nick Allen (nickwallen)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 40m