[DRILL-5414] Issue with Querying Directories - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.10.0
Fix Version/s: None
Component/s: Functions - Drill
Labels:
None
Environment:

Kubernetes running Debian GNU/Linux 8 containers.
openjdk version "1.8.0_111".
AWS.
Using s3 buckets

Description

Thanks for Apache Drill, it's pretty awesome.

I'm hoping to use Drill's directory querying and have structured my data archive in S3 to test it. However, I've run into a problem when filtering on the directory columns.

My directory structure in S3 looks like this:
s3/devices_by_id/device_id/2016/11/12/<filename>.json.gz

From the documentation I figured the following queries were equivalent:

select count(*) from `s3`.`/deviceid/xyz/2016/11/`;
+---------+
| EXPR$0  |
+---------+
| 286049  |
+---------+
1 row selected (10.351 seconds)

select count(*) from `s3`.`/deviceid/` where dir0='xyz' and dir1='2016' and dir2='11';

But this latter query just hangs, and no profile appears in the UI. When I press Ctrl-C, I get:

--
--
No rows selected (1481.727 seconds)

If I try to run an explain plan, that also hangs.
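
The equivalence I assumed rests on Drill exposing each directory level below the queried path as dir0, dir1, dir2, and so on. Here is a minimal Python sketch of that mapping as I understand it from the documentation. It is not Drill's implementation; the helper names `dir_columns` and `matches` and the file name `data.json.gz` are made up for illustration.

```python
from pathlib import PurePosixPath


def dir_columns(query_root: str, file_path: str) -> dict:
    """Map directory levels below query_root to dir0, dir1, ... (hypothetical helper)."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(query_root))
    # Every component except the last one (the file name) is a directory level.
    return {f"dir{i}": part for i, part in enumerate(relative.parts[:-1])}


def matches(query_root: str, file_path: str, **filters: str) -> bool:
    """Return True if the file's dirN values satisfy all dirN='value' filters."""
    cols = dir_columns(query_root, file_path)
    return all(cols.get(name) == value for name, value in filters.items())


# Illustrative path only; the real file names are not shown in this report.
example = "/deviceid/xyz/2016/11/12/data.json.gz"

# Querying from /deviceid: the device, year, month and day become dir0..dir3.
cols_from_root = dir_columns("/deviceid", example)

# Querying from /deviceid/xyz/2016/11 directly: only the day level remains.
cols_from_month = dir_columns("/deviceid/xyz/2016/11", example)

# The filter from the second query selects this file.
selected = matches("/deviceid", example, dir0="xyz", dir1="2016", dir2="11")
```

Under this reading both queries should count the same set of files. The difference is that the second one starts from `/deviceid/`, so the whole tree below it may have to be listed before the dirN filters can be applied.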

There are a total of 13283 compressed JSON files under the 2016/11 directory in the S3 bucket.

The log doesn't show much information.

Could anyone help with this, please? I can provide more information as required. Hopefully this is not user error.

People

Assignee: Unassigned

Reporter: Paul Makkar

Votes: 0

Watchers: 2

Dates

Created: 05/Apr/17 15:16

Updated: 06/Dec/18 21:39