[DRILL-4279] Improve performance for skipAll query against Text/JSON/Parquet table - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.6.0
Component/s: Query Planning & Optimization
Labels: None

Description

When a query does not require any specific column to be returned from SCAN, for instance,

Q1:  select count(*) from T1;
Q2:  select 1 + 100 from T1;
Q3:  select  1.0 + random() from T1;

Drill's planner uses a ColumnList containing the * column, plus a SKIP_ALL mode. However, the mode is not serialized / deserialized. This leads to two problems.
1) The EXPLAIN plan is confusing, since there is no way to distinguish a "SELECT *" query from this SKIP_ALL mode.
For instance,

explain plan for select count(*) from dfs.`/Users/jni/work/data/yelp/t1`;
00-03          Project($f0=[0])
00-04            Scan(groupscan=[EasyGroupScan [selectionRoot=file:/Users/jni/work/data/yelp/t1, numFiles=2, columns=[`*`], files= ...

2) If the query is executed in distributed / parallel mode, the missing serialization of the mode means that some fragments fetch all the columns while other fragments skip all the columns. That causes an execution error.

For instance, changing slice_target to force the query to run in multiple fragments triggers the execution error.

select count(*) from dfs.`/Users/jni/work/data/yelp/t1`;
org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: Error parsing JSON - You tried to start when you are using a ValueWriter of type NullableBitWriterImpl.

The directory "t1" contains just two Yelp JSON files.
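
The failure mode in problem 2 can be sketched in a few lines of Java. This is a hypothetical, simplified model, not Drill's actual classes: `ScanSpec`, `Mode`, `serialize`, and `deserialize` are names invented for illustration, and the plain comma-separated string stands in for the real plan serialization. It only shows that a field left out of the serialized form silently reverts to a default on the receiving side.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of a scan definition; not Drill's real classes.
final class SkipAllRoundTrip {

    enum Mode { ALL, SKIP_ALL }

    static final class ScanSpec {
        final List<String> columns;
        final Mode mode;

        ScanSpec(List<String> columns, Mode mode) {
            this.columns = columns;
            this.mode = mode;
        }
    }

    // Writes only the column list; the mode is dropped, as described in the issue.
    static String serialize(ScanSpec spec) {
        return String.join(",", spec.columns);
    }

    // The receiver has no mode to read, so it falls back to an assumed default (ALL).
    static ScanSpec deserialize(String wire) {
        return new ScanSpec(Arrays.asList(wire.split(",")), Mode.ALL);
    }

    public static void main(String[] args) {
        ScanSpec planned = new ScanSpec(Arrays.asList("*"), Mode.SKIP_ALL);
        ScanSpec remote = deserialize(serialize(planned));

        // The two sides now disagree on how the same scan should behave.
        System.out.println("local fragment mode:  " + planned.mode);
        System.out.println("remote fragment mode: " + remote.mode);
    }
}
```

In this model the column list survives the round trip but the mode does not, which mirrors the situation where one fragment skips all columns and another reads them all.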

Ideally, I think that when no column is required from SCAN, the explain plan should show an empty column list. The SKIP_ALL mode together with the star * column is confusing and error prone.
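
A minimal sketch of this suggestion, assuming the column list is the only thing that travels with the plan. The class and method names (`EmptyColumnListSketch`, `isSkipAll`, `isStarQuery`, `explainColumns`) are hypothetical and are not taken from Drill; the sketch illustrates the idea in the description, not necessarily how the fix was implemented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the proposal; names are illustrative, not Drill's API.
final class EmptyColumnListSketch {

    // With no separate mode field, the column list alone tells every fragment what to read.
    static boolean isSkipAll(List<String> columns) {
        return columns.isEmpty();
    }

    static boolean isStarQuery(List<String> columns) {
        return columns.contains("*");
    }

    // Renders text shaped like the columns=[...] part of the EXPLAIN output quoted earlier.
    static String explainColumns(List<String> columns) {
        StringBuilder sb = new StringBuilder("columns=[");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append('`').append(columns.get(i)).append('`');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        List<String> countStar = Collections.emptyList(); // e.g. select count(*) from T1
        List<String> selectStar = Arrays.asList("*");      // e.g. select * from T1

        System.out.println(explainColumns(countStar) + " skipAll=" + isSkipAll(countStar));
        System.out.println(explainColumns(selectStar) + " skipAll=" + isSkipAll(selectStar));
    }
}
```

Because the skip-all decision is derived from data that is already serialized, every fragment would reach the same answer, and the plan text for a count(*) query would no longer look like a "SELECT *" scan.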

Issue Links

links to

GitHub Pull Request #328

GitHub Pull Request #342

People

Assignee: Jinfeng Ni

Reporter: Jinfeng Ni

Reviewer: dgu-atmapr

Votes: 0

Watchers: 4

Dates

Created: 18/Jan/16 05:16

Updated: 03/Oct/16 22:03

Resolved: 03/Feb/16 01:36