Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-6997

Semantic Exception for string comparison filter

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      – Consider that you have the following dataset:

      one,1
      two,2
      three,three

      – The DDL is the following:

      create table j1 (a string, b string) row format delimited fields terminated by ',';

      – If you do a filter on column b like this:

      select * from j1 where b!=1;

      – I would expect to see the data from row 2 and 3. However you only see row 2. If you make the following change to the query you get the correct results:

      select * from j1 where b!='1';

      – Looking at an explain of both queries it appears that in the query without the quotes the 1 is being changed to 1.0 and then being compared to the string three incorrectly typed so the result is filtered out. I think that this should be improved with a Semantic exception when type of the column is string and the value being passed in the filter is not in quotes.

      == Without Quotes

      STAGE PLANS:
      Stage: Stage-1
      Map Reduce
      Alias -> Map Operator Tree:
      j1
      TableScan
      alias: j1
      Filter Operator
      predicate:
      expr: (b <> 1.0)
      type: boolean
      Select Operator
      expressions:
      expr: a
      type: string
      expr: b
      type: string
      outputColumnNames: _col0, _col1
      File Output Operator
      compressed: false
      GlobalTableId: 0
      table:
      input format: org.apache.hadoop.mapred.TextInputFormat
      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

      == With Quotes

      STAGE PLANS:
      Stage: Stage-1
      Map Reduce
      Alias -> Map Operator Tree:
      j1
      TableScan
      alias: j1
      Filter Operator
      predicate:
      expr: (b <> '1')
      type: boolean
      Select Operator
      expressions:
      expr: a
      type: string
      expr: b
      type: string
      outputColumnNames: _col0, _col1
      File Output Operator
      compressed: false
      GlobalTableId: 0
      table:
      input format: org.apache.hadoop.mapred.TextInputFormat
      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

      Attachments

        Activity

          People

            Unassigned Unassigned
            johndee Johndee Burks
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: