[DRILL-5553] SELECT *, columns produces nonsense results - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 1.10.0
Fix Version/s: None
Component/s: Storage - Text & CSV
Labels: None

Description

Consider a slight variation of the case discussed in DRILL-5551.

Input file: CSV with headers:

a,b,c
10,foo,bar

As in DRILL-5550, the CSV plugin is configured to use headers.

Run this (admittedly strange) query:

SELECT *, columns FROM `dfs.data.example.csv`

The resulting schema is:

BatchSchema [fields=[
a(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)], 
b(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)], 
c(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)], 
columns(INT:OPTIONAL) [$bits$(UINT1:REQUIRED), columns(INT:OPTIONAL)]], 
selectionVector=NONE]

To make it easier to read:

a(VARCHAR:REQUIRED),
b(VARCHAR:REQUIRED),
c(VARCHAR:REQUIRED),
columns(INT:OPTIONAL)

In DRILL-5551, "columns" changes meaning from an array of columns to an ordinary blank column. Here, it changes meaning again, this time to a nullable INT (our normal "placeholder" for missing columns).

Expected:

1. That, per DRILL-5552, no other column reference can occur with "*".
2. If item 1 is not fixed, that the scanner (or text reader) forbid the use of either "*" or "columns" with other column references.
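The rule in item 2 can be sketched as a small check on the projection list. This is only an illustration of the expected behavior, not Drill's actual planner or scanner code: the function name `validate_projection`, the `SPECIAL` set, and the error text are all hypothetical, and the sketch handles only simple column names (not expressions such as array indexing).

```python
# Hypothetical sketch of the check proposed in item 2; not Drill code.
# Names that may not be combined with any other column reference.
SPECIAL = {"*", "columns"}


def validate_projection(names):
    """Return the normalized projection list, or raise ValueError.

    A projection is rejected when it contains "*" or "columns"
    together with any other distinct column reference.
    """
    normalized = [name.strip().lower() for name in names]
    special = sorted({name for name in normalized if name in SPECIAL})
    if special and len(set(normalized)) > 1:
        raise ValueError(
            "cannot combine %s with other column references: %s"
            % (", ".join(special), ", ".join(normalized))
        )
    return normalized


if __name__ == "__main__":
    print(validate_projection(["a", "b", "c"]))  # plain columns are accepted
    print(validate_projection(["*"]))            # wildcard alone is accepted
    try:
        validate_projection(["*", "columns"])    # the query from this report
    except ValueError as err:
        print("rejected:", err)
```

Under this sketch, the query in this report would fail with an error instead of silently producing a nullable INT column.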

People

Assignee: Unassigned

Reporter: Paul Rogers

Votes: 0

Watchers: 1

Dates

Created: 29/May/17 23:00

Updated: 06/Jul/17 06:42