Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 1.14.0
Fix Version/s: None
Component/s: None
Environment
Cluster:
3 × m5.large AWS EC2 instances with Ubuntu 16.04.
Java:
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-0ubuntu0.16.04.1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
Drill:
#Generated by Git-Commit-Id-Plugin
#Tue Jul 31 17:18:08 PDT 2018
git.commit.id.abbrev=0508a12
git.commit.user.email=bben-zvi@mapr.com
git.commit.message.full=[maven-release-plugin] prepare release drill-1.14.0\n
git.commit.id=0508a128853ce796ca7e99e13008e49442f83147
git.commit.message.short=[maven-release-plugin] prepare release drill-1.14.0
git.commit.user.name=Ben-Zvi
git.build.user.name=Ben-Zvi
git.commit.id.describe=drill-1.14.0
git.build.user.email=bben-zvi@mapr.com
git.branch=0508a128853ce796ca7e99e13008e49442f83147
git.commit.time=31.07.2018 @ 16\:50\:38 PDT
git.build.time=31.07.2018 @ 17\:18\:08 PDT
git.remote.origin.url=https\://github.com/apache/drill.git
Development:
Java:
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
Drill:
(same git.properties as the cluster build above)
Description
I have an AWS S3 bucket in us-east-1 with hundreds of thousands of Parquet files named like ${UUID}.parquet. Each originally holds ~50K records and is ~25 MiB in size, but the issue is reproducible even with a single 96 KiB file containing two records. The records are thin wrappers around OpenRTB's BidRequest.
I have configured a Drill cluster (no Hadoop) with three m5.large instances (2 vCPUs, 8 GB RAM each) in the same region, all defaults. The issue is reproducible with drill-embedded as well.
My S3 data source config (both drill-embedded and cluster):
{
  "type": "file",
  "connection": "s3a://bukket/",
  "config": {
    "fs.s3a.access.key": "XXXXXXXXXXXXXX9XX9XX",
    "fs.s3a.secret.key": "Xx/xxXxxxX9xxxXxxXXxxXxXXx9xxXXXXxXXxxx9",
    "fs.s3a.endpoint": "s3.us-east-1.amazonaws.com",
    "fs.s3a.connection.maximum": "100",
    "fs.s3a.connection.timeout": "10000"
  },
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    }
  },
  "enabled": true
}
I omitted the non-Parquet format configs (CSV, JSON, and others).
With debug logging enabled, when I execute a query like SELECT * FROM s3.`slice/9x99x99x-99x9-99x9-x99x-999x999xxxxx.parquet` LIMIT 10 (slice is a "directory" name inside the S3 bucket), I first see the Parquet file's data flowing over the wire, but then Drill starts fetching byte ranges with the HTTP Range header:
DEBUG o.a.h.i.conn.DefaultClientConnection - Sending request: GET /slice/9x99x99x-99x9-99x9-x99x-999x999xxxxx.parquet HTTP/1.1
DEBUG org.apache.http.wire - >> "GET /slice/9x99x99x-99x9-99x9-x99x-999x999xxxxx.parquet HTTP/1.1[\r][\n]"
DEBUG org.apache.http.wire - >> "Host: bukket.s3.us-east-1.amazonaws.com[\r][\n]"
DEBUG org.apache.http.wire - >> "Authorization: AWS XXXXXXXXXXXXXX9XX9XX:xxXxXxXxXx9xXXXXxxXXxX9xxXx=[\r][\n]"
DEBUG org.apache.http.wire - >> "User-Agent: aws-sdk-java/1.7.4 Linux/4.17.19-1-MANJARO OpenJDK_64-Bit_Server_VM/25.181-b13/1.8.0_181[\r][\n]"
DEBUG org.apache.http.wire - >> "Range: bytes=2102-98296[\r][\n]"
DEBUG org.apache.http.wire - >> "Date: Tue, 04 Sep 2018 23:30:10 GMT[\r][\n]"
DEBUG org.apache.http.wire - >> "Content-Type: application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
DEBUG org.apache.http.wire - >> "Connection: Keep-Alive[\r][\n]"
DEBUG org.apache.http.wire - >> "[\r][\n]"
DEBUG org.apache.http.headers - >> GET /slice/9x99x99x-99x9-99x9-x99x-999x999xxxxx.parquet HTTP/1.1
DEBUG org.apache.http.headers - >> Host: bukket.s3.us-east-1.amazonaws.com
DEBUG org.apache.http.headers - >> Authorization: AWS XXXXXXXXXXXXXX9XX9XX:xxXxXxXxXx9xXXXXxxXXxX9xxXx=
DEBUG org.apache.http.headers - >> User-Agent: aws-sdk-java/1.7.4 Linux/4.17.19-1-MANJARO OpenJDK_64-Bit_Server_VM/25.181-b13/1.8.0_181
DEBUG org.apache.http.headers - >> Range: bytes=2102-98296
DEBUG org.apache.http.headers - >> Date: Tue, 04 Sep 2018 23:30:10 GMT
DEBUG org.apache.http.headers - >> Content-Type: application/x-www-form-urlencoded; charset=utf-8
DEBUG org.apache.http.headers - >> Connection: Keep-Alive
DEBUG org.apache.http.wire - << "HTTP/1.1 206 Partial Content[\r][\n]"
DEBUG org.apache.http.wire - << "x-amz-id-2: Xxx9xXXx99XxXXxXX9XxxxXxxXxx9X9X99XXXxXXxXxXxxXXxXxxXxXXXXxXxXXXXxxXX9X9Xx9=[\r][\n]"
DEBUG org.apache.http.wire - << "x-amz-request-id: 99X99XX999X9999X[\r][\n]"
DEBUG org.apache.http.wire - << "Date: Tue, 04 Sep 2018 23:30:11 GMT[\r][\n]"
DEBUG org.apache.http.wire - << "Last-Modified: Tue, 04 Sep 2018 21:44:29 GMT[\r][\n]"
DEBUG org.apache.http.wire - << "ETag: "9x9xxxx999xxx999xx9999xxxx99xx99"[\r][\n]"
DEBUG org.apache.http.wire - << "Accept-Ranges: bytes[\r][\n]"
DEBUG org.apache.http.wire - << "Content-Range: bytes 2102-98296/98297[\r][\n]"
DEBUG org.apache.http.wire - << "Content-Type: application/octet-stream[\r][\n]"
DEBUG org.apache.http.wire - << "Content-Length: 96195[\r][\n]"
DEBUG org.apache.http.wire - << "Server: AmazonS3[\r][\n]"
DEBUG org.apache.http.wire - << "[\r][\n]"
DEBUG o.a.h.i.conn.DefaultClientConnection - Receiving response: HTTP/1.1 206 Partial Content
DEBUG org.apache.http.headers - << HTTP/1.1 206 Partial Content
DEBUG org.apache.http.headers - << x-amz-id-2: Xxx9xXXx99XxXXxXX9XxxxXxxXxx9X9X99XXXxXXxXxXxxXXxXxxXxXXXXxXxXXXXxxXX9X9Xx9=
DEBUG org.apache.http.headers - << x-amz-request-id: 99X99XX999X9999X
DEBUG org.apache.http.headers - << Date: Tue, 04 Sep 2018 23:30:11 GMT
DEBUG org.apache.http.headers - << Last-Modified: Tue, 04 Sep 2018 21:44:29 GMT
DEBUG org.apache.http.headers - << ETag: "9x9xxxx999xxx999xx9999xxxx99xx99"
DEBUG org.apache.http.headers - << Accept-Ranges: bytes
DEBUG org.apache.http.headers - << Content-Range: bytes 2102-98296/98297
DEBUG org.apache.http.headers - << Content-Type: application/octet-stream
DEBUG org.apache.http.headers - << Content-Length: 96195
DEBUG org.apache.http.headers - << Server: AmazonS3
DEBUG c.a.http.impl.client.SdkHttpClient - Connection can be kept alive indefinitely
DEBUG com.amazonaws.request - Received successful response: 206, AWS Request ID: 99X99XX999X9999X
DEBUG o.apache.hadoop.fs.s3a.S3AFileSystem - Opening '/slice/9x99x99x-99x9-99x9-x99x-999x999xxxxx.parquet' for reading.
DEBUG o.apache.hadoop.fs.s3a.S3AFileSystem - Getting path status for /slice/9x99x99x-99x9-99x9-x99x-999x999xxxxx.parquet (slice/9x99x99x-99x9-99x9-x99x-999x999xxxxx.parquet)
Each subsequent request's range start is only a few bytes larger than the previous one (e.g., the Range headers of consecutive requests are 2315-98296, 2351-98296, 2387-98296, and so on).
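A back-of-the-envelope check (my own arithmetic, based only on the three Range values quoted above and the configured pool size of 100) shows why exhaustion happens after only a few kilobytes of progress:

```java
// Hypothetical illustration: estimates how many bytes of forward progress
// fit into the connection pool if every small read borrows a new connection.
public class RangeMath {
    public static void main(String[] args) {
        // Range starts observed in three consecutive requests (from the logs).
        int[] starts = {2315, 2351, 2387};
        int step = starts[1] - starts[0];           // 36 bytes advanced per request
        int poolSize = 100;                         // fs.s3a.connection.maximum
        int bytesUntilExhaustion = step * poolSize; // progress before the pool drains
        System.out.println("step=" + step + " bytes; " + poolSize
                + " connections -> ~" + bytesUntilExhaustion
                + " bytes until pool exhaustion");
    }
}
```

With a ~36-byte step, 100 requests cover only ~3.6 KiB, which matches the ~4 KiB of "read" progress I observe before the pool runs dry.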
At some point, after only about 4 KiB of data has been "read" (judging by the Range header start), Drill runs out of connections:
DEBUG com.amazonaws.http.AmazonHttpClient - Retriable error detected, will retry in 20000ms, attempt number: 8
DEBUG o.a.h.i.c.PoolingClientConnectionManager - Connection request: [route: {s}->https://bukket.s3.us-east-1.amazonaws.com][total kept alive: 0; route allocated: 100 of 100; total allocated: 100 of 100]
DEBUG org.eclipse.jetty.server.session - Scavenging sessions at 1536103994766
DEBUG c.a.h.c.ClientConnectionRequestFactory - java.lang.reflect.InvocationTargetException: null
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_181]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_181]
    at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) ~[aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.http.conn.$Proxy77.getConnection(Unknown Source) [na:na]
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456) [httpclient-4.2.5.jar:4.2.5]
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) [httpclient-4.2.5.jar:4.2.5]
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) [httpclient-4.2.5.jar:4.2.5]
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:384) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:976) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:956) [aws-java-sdk-1.7.4.jar:na]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:892) [hadoop-aws-2.7.1.jar:na]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:373) [hadoop-aws-2.7.1.jar:na]
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767) [hadoop-common-2.7.1.jar:na]
    at org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:134) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.parquet.hadoop.ColumnChunkIncReadStore.addColumn(ColumnChunkIncReadStore.java:246) [drill-java-exec-1.14.0.jar:1.10.0]
    at org.apache.drill.exec.store.parquet2.DrillParquetReader.setup(DrillParquetReader.java:249) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas(ScanBatch.java:251) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:169) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext(LimitRecordBatch.java:87) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:142) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:103) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:83) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:294) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:281) [drill-java-exec-1.14.0.jar:1.14.0]
    at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181]
    at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) [hadoop-common-2.7.1.jar:na]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:281) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0.jar:1.14.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232) ~[httpclient-4.2.5.jar:4.2.5]
    at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199) ~[httpclient-4.2.5.jar:4.2.5]
    ... 48 common frames omitted
INFO com.amazonaws.http.AmazonHttpClient - Unable to execute HTTP request: Timeout waiting for connection from pool
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232) ~[httpclient-4.2.5.jar:4.2.5]
    at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199) ~[httpclient-4.2.5.jar:4.2.5]
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_181]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_181]
    at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) ~[aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.http.conn.$Proxy77.getConnection(Unknown Source) ~[na:na]
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456) ~[httpclient-4.2.5.jar:4.2.5]
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) ~[httpclient-4.2.5.jar:4.2.5]
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) ~[httpclient-4.2.5.jar:4.2.5]
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:384) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:976) [aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:956) [aws-java-sdk-1.7.4.jar:na]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:892) [hadoop-aws-2.7.1.jar:na]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:373) [hadoop-aws-2.7.1.jar:na]
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767) [hadoop-common-2.7.1.jar:na]
    at org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:134) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.parquet.hadoop.ColumnChunkIncReadStore.addColumn(ColumnChunkIncReadStore.java:246) [drill-java-exec-1.14.0.jar:1.10.0]
    at org.apache.drill.exec.store.parquet2.DrillParquetReader.setup(DrillParquetReader.java:249) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas(ScanBatch.java:251) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:169) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext(LimitRecordBatch.java:87) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:142) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:103) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:83) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:294) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:281) [drill-java-exec-1.14.0.jar:1.14.0]
    at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181]
    at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) [hadoop-common-2.7.1.jar:na]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:281) [drill-java-exec-1.14.0.jar:1.14.0]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0.jar:1.14.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
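The trace shows S3AFileSystem.open being called from ColumnChunkIncReadStore.addColumn while the pool reports "route allocated: 100 of 100" with zero kept alive. My suspicion, sketched below with hypothetical names (this is NOT Drill or S3A code, just an illustration of the pattern), is that each opened stream borrows a pooled connection that is never returned, so a fixed-size pool drains after exactly that many opens:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Minimal sketch of the suspected leak: one pooled connection is borrowed
// per opened stream and previous streams are never closed, so the pool
// empties even though only one stream is actively read at a time.
public class PoolLeakSketch {
    // Stand-in for the HTTP connection pool (fs.s3a.connection.maximum).
    static final int POOL_SIZE = 100;
    static final Semaphore pool = new Semaphore(POOL_SIZE);

    // Stand-in for an input stream that holds a pooled connection while open.
    static class LeakyStream implements AutoCloseable {
        LeakyStream() throws InterruptedException {
            // Borrow a connection; fail fast when the pool is empty,
            // mirroring ConnectionPoolTimeoutException.
            if (!pool.tryAcquire(10, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("Timeout waiting for connection from pool");
            }
        }
        @Override public void close() { pool.release(); }
    }

    public static void main(String[] args) throws Exception {
        List<LeakyStream> leaked = new ArrayList<>();
        int opened = 0;
        try {
            // Open a new stream per read without closing the previous ones.
            for (int i = 0; i < POOL_SIZE + 1; i++) {
                leaked.add(new LeakyStream());
                opened++;
            }
        } catch (IllegalStateException e) {
            System.out.println("Pool exhausted after " + opened
                    + " open streams: " + e.getMessage());
        } finally {
            for (LeakyStream s : leaked) s.close();  // what the real code should do
        }
    }
}
```

If this pattern is what is happening, enlarging the pool only delays the failure, since the number of opens grows with the number of reads rather than with concurrency.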
Drill makes 10 retry attempts with exponential backoff and finally fails.
I've tried different values for fs.s3a.connection.maximum (3, 10, 100) and fs.s3a.connection.timeout (10000, 30000, 60000), all with no luck.
Is that a connection leak? Can it be fixed?