IMPALA-2133

HBase filter doesn't unescape string values correctly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.2
    • Fix Version/s: Impala 2.3.0
    • Component/s: None
    • Labels: None

      Description

      Consider the following example.

      In HBase:

      create 'hbase_table_single_quote_issue','cf' 
      
      put 'hbase_table_single_quote_issue','ROW1','cf:id','1000' 
      put 'hbase_table_single_quote_issue','ROW2','cf:id','1001' 
      
      put 'hbase_table_single_quote_issue','ROW1','cf:name',"William's" 
      put 'hbase_table_single_quote_issue','ROW2','cf:name','Richard' 
      

      In Hive:

      CREATE EXTERNAL TABLE external_tbl_single_quote_issue (key string, id string, name string)
      STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:id,cf:name")
      TBLPROPERTIES ("hbase.table.name" = "hbase_table_single_quote_issue");
      

      In Impala:

      invalidate metadata; 
      
      select * from external_tbl_single_quote_issue where name='William\'s'; 
      

      Below is the output:

      [cc1udtlhcld001.stack.qadev.corp:21000] > select * from external_tbl_single_quote_issue; 
      Query: select * from external_tbl_single_quote_issue 
      +------+------+-----------+
      | key  | id   | name      |
      +------+------+-----------+
      | ROW1 | 1000 | William's |
      | ROW2 | 1001 | Richard   |
      +------+------+-----------+
      Fetched 2 row(s) in 0.27s 
      [cc1udtlhcld001.stack.qadev.corp:21000] > select * from external_tbl_single_quote_issue where name='William\'s'; 
      Query: select * from external_tbl_single_quote_issue where name='William\'s' 
      
      Fetched 0 row(s) in 0.17s 
      
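      Note that Impala omits the matching row (ROW1) from the output: the filtered query returns 0 rows even though the unfiltered query shows that name = William's exists. Here is the explain plan for the filtered query: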
      
      [cc1udtlhcld001.stack.qadev.corp:21000] > explain select * from external_tbl_single_quote_issue where name='William\'s';
      Query: explain select * from external_tbl_single_quote_issue where name='William\'s'
      +------------------------------------------------------------------------------------+
      | Explain String                                                                     |
      +------------------------------------------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=1.00GB VCores=1                            |
      | WARNING: The following tables are missing relevant table and/or column statistics. |
      | default.external_tbl_single_quote_issue                                            |
      |                                                                                    |
      | 01:EXCHANGE [UNPARTITIONED]                                                        |
      | |                                                                                  |
      | 00:SCAN HBASE [default.external_tbl_single_quote_issue]                            |
      |    hbase filters: cf:name EQUAL 'William\'s'                                       |
      |    predicates: name = 'William\'s'                                                 |
      +------------------------------------------------------------------------------------+
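
The plan above shows the HBase filter value still carrying the backslash (`'William\'s'`), which is consistent with the filter comparing the escaped text against the stored bytes. The following Python sketch illustrates that mismatch only; it is not Impala's code or the actual fix. The helper `unescape_sql_literal` and the variable names are hypothetical, and the set of escape sequences it handles is an assumption made for illustration.

```python
# Illustrative sketch only; not taken from Impala's source.
# unescape_sql_literal is a hypothetical helper, and the escape
# sequences it recognizes are an assumption.

def unescape_sql_literal(body: str) -> str:
    """Resolve backslash escapes in the body of a quoted SQL string literal."""
    simple = {"n": "\n", "t": "\t", "r": "\r", "0": "\0"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Known escapes map to control characters; anything else
            # (such as \' or \\) yields the escaped character itself.
            out.append(simple.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Bytes written by the HBase shell: put ..., 'cf:name', "William's"
stored = b"William's"

# Text between the outer quotes in: where name='William\'s'
literal_body = r"William\'s"

# Passing the literal through without unescaping keeps the backslash.
raw_filter_value = literal_body.encode("utf-8")

# Unescaping first yields the bytes that are actually stored.
fixed_filter_value = unescape_sql_literal(literal_body).encode("utf-8")

print(raw_filter_value == stored)    # False
print(fixed_filter_value == stored)  # True
```

Under these assumptions, an equality filter built from the raw literal can never match ROW1, which would explain the empty result reported above.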
      

            People

            • Assignee: Martin Grund (mgrund_impala_bb91)
            • Reporter: Martin Grund (mgrund_impala_bb91)