Hive / HIVE-3179

HBase Handler doesn't handle NULLs properly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.10.0
    • Fix Version/s: 0.11.0
    • Component/s: HBase Handler
    • Labels:
      None

      Description

      We found a severe issue in the HBase Handler: Hive can return incorrect data when a column is NULL in HBase (that is, when the cell does not exist for that row).

      In HBase Shell:

      create 'hive_hbase_test', 'test'
      put 'hive_hbase_test', '1', 'test:c1', 'c1-1'
      put 'hive_hbase_test', '1', 'test:c2', 'c2-1'
      put 'hive_hbase_test', '1', 'test:c3', 'c3-1'
      put 'hive_hbase_test', '2', 'test:c1', 'c1-2'
      

      In Hive:

      DROP TABLE IF EXISTS hive_hbase_test;
      CREATE EXTERNAL TABLE hive_hbase_test (
        id int,
        c1 string,
        c2 string,
        c3 string
      )
      STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES ("hbase.columns.mapping" =
      ":key#s,test:c1#s,test:c2#s,test:c3#s")
      TBLPROPERTIES("hbase.table.name" = "hive_hbase_test");
      
      hive> select * from hive_hbase_test;
      OK
      1	c1-1	c2-1	c3-1
      2	c1-2	NULL	NULL
      
      hive> select c1 from hive_hbase_test;
      c1-1
      c1-2
      
      hive> select c1, c2 from hive_hbase_test;
      c1-1	c2-1
      c1-2	NULL
      

      So far everything is correct, but now:

      hive> select c1, c2, c2 from hive_hbase_test;
      c1-1	c2-1	c2-1
      c1-2	NULL	c2-1
      

      When c2 is selected twice, the first occurrence is correct (NULL for
      row 2), but the second occurrence returns the value from the previous
      row (c2-1). The same happens with longer select lists:

      hive> select c1, c3, c2, c2, c3, c3, c1 from hive_hbase_test;
      c1-1	c3-1	c2-1	c2-1	c3-1	c3-1	c1-1
      c1-2	NULL	NULL	c2-1	c3-1	c3-1	c1-2
      

      We've narrowed this down to LazyHBaseRow#uncheckedGetField setting fieldsInited[fieldID] = true too early, before it is known whether the cell exists. We'll try to provide a patch, which will need review.
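
      To illustrate the suspected mechanism, here is a minimal, self-contained Java sketch. It is a toy model, not the actual LazyHBaseRow source: the class ToyLazyRow, the clearOnMissing switch, and the select helper are all hypothetical names made up for this example. It assumes the row object is reused across rows and that only the fieldsInited flags are reset between rows, which is consistent with the output above. The "corrected" variant is only one possible fix and is not necessarily what the eventual patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class StaleFieldDemo {

    /** Toy stand-in for a lazy row object that is reused for every row of a scan. */
    static class ToyLazyRow {
        private final String[] columns;        // column qualifier per field ID
        private final String[] fields;         // cached values, kept between rows
        private final boolean[] fieldsInited;  // per-row "already resolved" flags
        private final boolean clearOnMissing;  // false = buggy model, true = corrected model
        private Map<String, String> cells;     // cells that exist in the current row

        ToyLazyRow(String[] columns, boolean clearOnMissing) {
            this.columns = columns;
            this.fields = new String[columns.length];
            this.fieldsInited = new boolean[columns.length];
            this.clearOnMissing = clearOnMissing;
        }

        /** Point the reused object at the next row; only the flags are reset. */
        void init(Map<String, String> cells) {
            this.cells = cells;
            Arrays.fill(fieldsInited, false);
        }

        String getField(int fieldID) {
            if (!fieldsInited[fieldID]) {
                fieldsInited[fieldID] = true;  // set before we know the cell exists
                String value = cells.get(columns[fieldID]);
                if (value == null && !clearOnMissing) {
                    return null;               // buggy path: stale cache entry survives
                }
                fields[fieldID] = value;       // corrected path also stores the null
            }
            return fields[fieldID];            // a second access lands here
        }
    }

    /** Evaluates "select <fieldIDs>" over the two test rows with one reused ToyLazyRow. */
    static List<String> select(boolean clearOnMissing, int... fieldIDs) {
        List<Map<String, String>> rows = List.of(
            Map.of("c1", "c1-1", "c2", "c2-1", "c3", "c3-1"),
            Map.of("c1", "c1-2"));             // row 2 has no c2 or c3 cell
        ToyLazyRow row = new ToyLazyRow(new String[] {"c1", "c2", "c3"}, clearOnMissing);
        List<String> out = new ArrayList<>();
        for (Map<String, String> cells : rows) {
            row.init(cells);
            List<String> line = new ArrayList<>();
            for (int id : fieldIDs) {
                String v = row.getField(id);
                line.add(v == null ? "NULL" : v);
            }
            out.add(String.join("\t", line));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(select(false, 0, 1, 1)); // buggy model: c1, c2, c2
        System.out.println(select(true, 0, 1, 1));  // corrected model
    }
}
```

      With clearOnMissing set to false, the second access to a missing column skips the lookup because the flag is already set, and falls through to the cached value left over from row 1. That reproduces the "c1-2 NULL c2-1" line from the report. With clearOnMissing set to true, both accesses return NULL.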

              People

              • Assignee: Lars Francke (lars_francke)
              • Reporter: Lars Francke (lars_francke)
              • Votes: 2
              • Watchers: 10
