Details
Description
Seen a test failure where a MR job failed to submit.
RM log has:
2018-04-30 15:00:17,864 WARN org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer: Unable to add the application to the delegation token renewer. java.lang.IllegalArgumentException: Invalid token service IP_ADDR:16000 at org.apache.hadoop.util.KMSUtil.createKeyProviderFromTokenService(KMSUtil.java:237) at org.apache.hadoop.crypto.key.kms.KMSTokenRenewer.createKeyProvider(KMSTokenRenewer.java:100) at org.apache.hadoop.crypto.key.kms.KMSTokenRenewer.renew(KMSTokenRenewer.java:57) at org.apache.hadoop.security.token.Token.renew(Token.java:414) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:590) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:587) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:585) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:463) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$800(DelegationTokenRenewer.java:79) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:894) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:871) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
while client log has
18/04/30 15:53:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1525128478242_0001 18/04/30 15:53:28 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:ns1, Ident: (token for systest: HDFS_DELEGATION_TOKEN owner=systest@EXAMPLE.COM, renewer=yarn, realUser=, issueDate=1525128807236, maxDate=1525733607236, sequenceNumber=1038, masterKeyId=20) 18/04/30 15:53:28 INFO mapreduce.JobSubmitter: Kind: HBASE_AUTH_TOKEN, Service: 621a942b-292f-493d-ba50-f9b783704359, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@0) 18/04/30 15:53:28 INFO mapreduce.JobSubmitter: Kind: KMS_DELEGATION_TOKEN, Service: IP_ADDR:16000, Ident: 00 07 73 79 73 74 65 73 74 04 79 61 72 6e 00 8a 01 63 18 c2 c3 d5 8a 01 63 3c cf 47 d5 8e 01 ec 10 18/04/30 15:53:29 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/systest/.staging/job_1525128478242_0001 18/04/30 15:53:29 WARN security.UserGroupInformation: PriviledgedActionException as:systest@EXAMPLE.COM (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1525128478242_0001 to YARN : Invalid token service IP_ADDR:16000 18/04/30 15:53:29 INFO client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService 18/04/30 15:53:29 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x1630ba2d0001cb5 18/04/30 15:53:29 INFO zookeeper.ZooKeeper: Session: 0x1630ba2d0001cb5 closed 18/04/30 15:53:29 INFO zookeeper.ClientCnxn: EventThread shut down 18/04/30 15:53:29 ERROR util.AbstractHBaseTool: Error running command-line tool java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1525128478242_0001 to YARN : Invalid token service IP_ADDR:16000 at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:336) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:244) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1325) at org.apache.hadoop.hbase.mapreduce.IntegrationTestBulkLoad.runLinkedListMRJob(IntegrationTestBulkLoad.java:298) at org.apache.hadoop.hbase.mapreduce.IntegrationTestBulkLoad.runLoad(IntegrationTestBulkLoad.java:225) at org.apache.hadoop.hbase.mapreduce.IntegrationTestBulkLoad.testBulkLoad(IntegrationTestBulkLoad.java:215) at org.apache.hadoop.hbase.mapreduce.IntegrationTestBulkLoad.runTestFromCommandLine(IntegrationTestBulkLoad.java:767) at org.apache.hadoop.hbase.IntegrationTestBase.doWork(IntegrationTestBase.java:123) at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:112) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at com.cloudera.itest.hbase.smoke.TestBulkLoad.testBulkLoad(TestBulkLoad.java:46) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:367) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:274) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:161) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:290) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:242) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:121) Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1525128478242_0001 to YARN : Invalid token service IP_ADDR:16000 at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:257) at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:290) at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:320) ... 41 more
Further debugging shows that this fall into the category of:
Server + renewer with HADOOP-14445, submitter without HADOOP-14445.
Unfortunately, my testing steps in HADOOP-14445 did not cover this scenario.
Attachments
Attachments
Issue Links
- is broken by
-
HADOOP-14445 Use DelegationTokenIssuer to create KMS delegation tokens that can authenticate to all KMS instances
- Resolved