Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
1.8.0
None
None
Hadoop 3.0
Description
When using the new fs.s3a.experimental.input.fadvise=random mode to access Parquet files stored in S3, we see a significant improvement in query performance but a slowdown in query planning. The slowdown comes from the way the metadata file is read: each chunk of 8000 bytes generates a new GET request to S3. Calling FSDataInputStream.setReadahead(metadata-filesize) to indicate that the whole file will be read avoids this behaviour.
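To make the effect concrete, here is a minimal sketch of the request pattern described above. It is a toy model, not the S3A implementation: the class ReadaheadModel, the method countGets, and the 1,000,000-byte metadata size are hypothetical names and values chosen for illustration. It assumes that every read falling outside the already-fetched range costs one GET, and that a GET fetches at least one chunk or up to the readahead size, whichever is larger.

```java
/** Toy model of ranged GETs issued while reading a file sequentially in fixed-size chunks. */
final class ReadaheadModel {

    /**
     * Counts simulated GET requests needed to read fileSize bytes in chunkSize reads.
     * Assumption: each GET fetches max(chunkSize, readahead) bytes starting at the
     * current position, and reads inside the fetched range cost no further request.
     */
    static long countGets(long fileSize, int chunkSize, long readahead) {
        long gets = 0;
        long fetchedUpTo = 0; // exclusive end offset of the data already fetched
        for (long pos = 0; pos < fileSize; pos += chunkSize) {
            long end = Math.min(pos + chunkSize, fileSize);
            if (end > fetchedUpTo) {
                gets++; // this read is not covered, so a new request is issued
                fetchedUpTo = Math.min(fileSize, pos + Math.max(chunkSize, readahead));
            }
        }
        return gets;
    }

    public static void main(String[] args) {
        long metadataSize = 1_000_000L; // hypothetical metadata file size in bytes

        // No readahead hint: one GET per 8000-byte chunk.
        System.out.println(countGets(metadataSize, 8000, 0)); // prints 125

        // Readahead equal to the file size: a single GET covers the whole file.
        System.out.println(countGets(metadataSize, 8000, metadataSize)); // prints 1
    }
}
```

In the real reader the hint would be given on the open stream, along the lines of in.setReadahead(fileStatus.getLen()) before the metadata is read, where in is the FSDataInputStream and fileStatus its FileStatus. Since setReadahead may throw UnsupportedOperationException on streams that do not support it, the call would likely need to be guarded.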
Issue Links
- relates to DRILL-6540 Upgrade to HADOOP-3.0 libraries (Closed)