[DRILL-6312] Enable pushing of cast expressions to the scanner for better schema discovery. - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.13.0
Fix Version/s: None
Component/s: Execution - Relational Operators, Query Planning & Optimization
Labels: None

Description

Drill is a schemaless engine that tries to infer the schema from disparate sources at read time. Currently, the scanners infer the schema for each batch from the data for each column in that batch. This handles many use cases but can fail when the data differs too much between batches, for example int in one batch and array[int] in another. (This is only one example; there are other cases as well.)
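The failure mode can be illustrated with a toy model. The sketch below is not Drill's scanner code; `infer_type` and `infer_batch_schema` are hypothetical helpers that only mimic the idea of inferring a column type independently for each batch.

```python
# Toy model of per-batch schema inference; hypothetical, not Drill's implementation.
def infer_type(value):
    """Map a Python value to a simplified type name."""
    if isinstance(value, list):
        return "array"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return "varchar"


def infer_batch_schema(batch):
    """Infer one type per column, using the first value seen in this batch."""
    schema = {}
    for row in batch:
        for column, value in row.items():
            schema.setdefault(column, infer_type(value))
    return schema


batch_1 = [{"a": 1}, {"a": 2}]   # column "a" looks like int
batch_2 = [{"a": [3, 4]}]        # column "a" looks like an array

schema_1 = infer_batch_schema(batch_1)
schema_2 = infer_batch_schema(batch_2)

# The two batches disagree on the type of "a", which is the kind of
# mismatch that downstream operators may be unable to reconcile.
schemas_agree = schema_1 == schema_2
```

Because each batch is inspected in isolation, nothing in this model prevents two batches of the same column from ending up with incompatible types.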

There is also a mechanism to create a view that casts the columns to the appropriate types. This solves the issue in some cases but fails in many others, because the cast expression is not pushed down to the scanner; it stays in the project, filter, or other operators higher in the query plan.

This JIRA proposes fixing this by propagating the type information embedded in the cast function to the scanners, so that the scanners can cast the incoming data appropriately.
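A minimal sketch of the proposed idea, under stated assumptions: the `scan` function, the `CASTS` table, and the shape of `cast_hints` are all hypothetical and only illustrate a scanner that receives target types from the plan and converts values as it reads them. How Drill would actually carry this information is not specified in this ticket.

```python
# Hypothetical sketch: cast hints from the query plan are handed to the scanner,
# which applies them at read time so every batch has the same column type.
CASTS = {"int": int, "float": float, "varchar": str}


def scan(batches, cast_hints):
    """Yield batches with hinted columns converted to the requested type."""
    for batch in batches:
        yield [
            {
                col: CASTS[cast_hints[col]](val) if col in cast_hints else val
                for col, val in row.items()
            }
            for row in batch
        ]


# Column "a" arrives as int in one batch and as mixed str/float in the next.
batches = [[{"a": 1}, {"a": 2}], [{"a": "3"}, {"a": 4.0}]]

# The hint stands in for a CAST(a AS VARCHAR) found higher in the plan.
result = list(scan(batches, {"a": "varchar"}))
```

With the hint applied inside the scan, every batch in this toy example exposes column `a` as a string, so no batch-to-batch type change reaches the operators above it.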

People

Assignee: Unassigned

Reporter: Hanumath Rao Maduri

Votes: 0

Watchers: 6

Dates

Created: 07/Apr/18 15:00

Updated: 09/Apr/18 17:02