[DRILL-6814] Query performance on S3 files - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: 1.14.0
Fix Version/s: None
Component/s: Storage - Other
Labels: None
Environment:

Amazon EC2 instances:
4 Red Hat Linux machines, version 7.5
RAM: 32 GB

Description

I have installed a 4-node Drill cluster on Amazon EC2 and am trying to execute a simple count on a single Amazon S3 file. The file is a CSV of approximately 14 GB.
The query returns the expected count after approximately 30 minutes of execution.
If we keep the same file in HDFS or create a table in Postgres, execution time is much shorter (approximately 2-3 minutes).
Is this normal behavior, or can something be done to make execution time on S3 files comparable?
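
For scale, the timings above imply roughly the following average read rates. This is only a back-of-the-envelope sketch: the sizes and durations are the approximate figures from this description, the 2-3 minute range is taken at its midpoint (2.5 minutes), 1 GB is assumed to be 1024 MB, and the helper name `throughput_mb_per_s` is hypothetical.

```python
# Rough effective scan throughput implied by the figures in the description.
# Assumptions (not measurements): 1 GB = 1024 MB, and the 2-3 minute
# HDFS/Postgres figure is taken as its midpoint, 2.5 minutes.

def throughput_mb_per_s(size_gb: float, minutes: float) -> float:
    """Return the average throughput in MB/s for reading size_gb in `minutes`."""
    return size_gb * 1024 / (minutes * 60)

FILE_SIZE_GB = 14  # approximate CSV size from the report

s3_rate = throughput_mb_per_s(FILE_SIZE_GB, 30)     # about 30 minutes on S3
hdfs_rate = throughput_mb_per_s(FILE_SIZE_GB, 2.5)  # about 2-3 minutes on HDFS

print(f"S3:   {s3_rate:.1f} MB/s")              # S3:   8.0 MB/s
print(f"HDFS: {hdfs_rate:.1f} MB/s")            # HDFS: 95.6 MB/s
print(f"Slowdown: {hdfs_rate / s3_rate:.0f}x")  # Slowdown: 12x
```

Under these assumptions the S3 scan averages about 8 MB/s across the whole cluster, roughly 12 times slower than the HDFS case.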

Attachments

S3investigate.txt (11 kB), attached 07/Jan/19 14:04 by Denys Ordynskiy

People

Assignee: Denys Ordynskiy

Reporter: Ashish Shukla

Votes: 0

Watchers: 6

Dates

Created: 28/Oct/18 17:22

Updated: 08/Jul/19 16:59

Resolved: 08/Jul/19 16:59