Details
-
New Feature
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
None
-
None
-
None
-
None
Description
– Consider that you have the following dataset:
one,1
two,2
three,three
– The DDL is the following:
create table j1 (a string, b string) row format delimited fields terminated by ',';
– If you do a filter on column b like this:
select * from j1 where b!=1;
– I would expect to see the data from row 2 and 3. However you only see row 2. If you make the following change to the query you get the correct results:
select * from j1 where b!='1';
– Looking at an explain of both queries it appears that in the query without the quotes the 1 is being changed to 1.0 and then being compared to the string three incorrectly typed so the result is filtered out. I think that this should be improved with a Semantic exception when type of the column is string and the value being passed in the filter is not in quotes.
== Without Quotes
STAGE PLANS:
Stage: Stage-1
Map Reduce
Alias -> Map Operator Tree:
j1
TableScan
alias: j1
Filter Operator
predicate:
expr: (b <> 1.0)
type: boolean
Select Operator
expressions:
expr: a
type: string
expr: b
type: string
outputColumnNames: _col0, _col1
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
== With Quotes
STAGE PLANS:
Stage: Stage-1
Map Reduce
Alias -> Map Operator Tree:
j1
TableScan
alias: j1
Filter Operator
predicate:
expr: (b <> '1')
type: boolean
Select Operator
expressions:
expr: a
type: string
expr: b
type: string
outputColumnNames: _col0, _col1
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat