Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Cannot Reproduce
- 3.3.0
- None
- None
Description
MapReduce job tasks fail. Some tasks fail with the exception below; others hang and then time out. Listing files on S3 works fine from the Hadoop client.
Exception from failed task:
2019-05-30 20:23:05,424 ERROR [IPC Server handler 19 on 35791] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1559246386193_0001_m_000000_0 - exited : org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on qe-cloudstorage-bucket: com.amazonaws.SdkClientException: Unable to execute HTTP request: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed: Unable to execute HTTP request: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:204)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:314)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:406)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:310)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:285)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:444)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:350)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:161)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:117)
    at org.apache.hadoop.examples.terasort.TeraOutputFormat.getOutputCommitter(TeraOutputFormat.java:152)
    at org.apache.hadoop.mapred.Task.initialize(Task.java:606)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1116)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1066)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4368)
    at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:5129)
    at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:5103)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4352)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4315)
    at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1344)
    at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1284)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:445)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
    ... 22 more
Caused by: javax.net.ssl.SSLException: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
    at org.wildfly.openssl.OpenSSLEngine.unwrap(OpenSSLEngine.java:543)
    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
    at org.wildfly.openssl.OpenSSLSocket.runHandshake(OpenSSLSocket.java:319)
    at org.wildfly.openssl.OpenSSLSocket.startHandshake(OpenSSLSocket.java:210)
    at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:396)
    at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:355)
    at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
    at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
    at com.amazonaws.http.conn.$Proxy17.connect(Unknown Source)
    at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
    at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
    at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1238)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
    ... 37 more
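The trace above nests three exceptions: AWSClientIOException wraps SdkClientException, which wraps the javax.net.ssl.SSLException raised by the Wildfly OpenSSL engine. As a rough sketch of how diagnostic code could pull out that innermost cause, the following walks the getCause() chain. The class name RootCause and its find method are hypothetical and not part of Hadoop or the AWS SDK; the exceptions built in main are stand-ins, since the real wrapper classes need the Hadoop and AWS SDK jars.

```java
import java.io.IOException;
import javax.net.ssl.SSLException;

// Hypothetical helper for illustration; not part of Hadoop or the AWS SDK.
class RootCause {
    // Follow getCause() links until an exception with no cause is reached.
    static Throwable find(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins for the three layers in the trace (assumption: plain JDK
        // exceptions replace AWSClientIOException and SdkClientException).
        Throwable ssl = new SSLException("certificate verify failed");
        Throwable sdk = new RuntimeException("Unable to execute HTTP request", ssl);
        Throwable top = new IOException("doesBucketExist on qe-cloudstorage-bucket", sdk);

        Throwable root = find(top);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In this report the innermost cause is the useful one: it shows the failure happening during the TLS handshake in org.wildfly.openssl rather than in S3A itself.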
ThreadDump of Hanging MapTask:
"main" #1 prio=5 os_prio=0 tid=0x00007ff424064800 nid=0x15109 waiting on condition [0x00007ff42c3fc000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doPauseBeforeRetry(AmazonHttpClient.java:1679)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.pauseBeforeRetry(AmazonHttpClient.java:1653)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1191)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4368)
    at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:5129)
    at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:5103)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4352)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4315)
    at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1344)
    at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1284)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:445)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$$Lambda$19/350413251.execute(Unknown Source)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:314)
    at org.apache.hadoop.fs.s3a.Invoker$$Lambda$20/253767021.execute(Unknown Source)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:406)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:310)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:285)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:444)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:350)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:161)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:117)
    at org.apache.hadoop.examples.terasort.TeraOutputFormat.getOutputCommitter(TeraOutputFormat.java:152)
    at org.apache.hadoop.mapred.Task.initialize(Task.java:606)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Issue Links
- is caused by: HADOOP-16050 S3A SSL connections should use OpenSSL (Resolved)
- is depended upon by: HADOOP-16346 Stabilize S3A OpenSSL support (Resolved)
- is related to: HADOOP-16405 Upgrade Wildfly Openssl version to 1.0.7.Final (Resolved)