Hive / HIVE-21397

BloomFilter for hive Managed [ACID] table does not work as expected

Details

    Description

      Steps to reproduce this issue:
      -----------------------------------------
      1. Create a Hive managed table as below:
      -----------------------------------------

      CREATE TABLE `bloomTest`( 
         `msisdn` string, 
         `imsi` varchar(20), 
         `imei` bigint, 
         `cell_id` bigint) 
       ROW FORMAT SERDE 
         'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
       STORED AS INPUTFORMAT 
         'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
       OUTPUTFORMAT 
         'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
       LOCATION 
         'hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest'
       TBLPROPERTIES ( 
         'bucketing_version'='2', 
         'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
         'orc.bloom.filter.fpp'='0.02', 
         'transactional'='true', 
         'transactional_properties'='default', 
         'transient_lastDdlTime'='1551206683');

      ----------------------------------------- 
      2. Insert a few rows. 
      ----------------------------------------- 
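
      A minimal sketch of this step, assuming any small set of rows is enough to trigger the behaviour. The values below are made up for illustration and are not taken from the original report.

```sql
-- Hypothetical sample rows; column order follows the DDL above
-- (msisdn string, imsi varchar(20), imei bigint, cell_id bigint).
INSERT INTO `bloomTest` VALUES
  ('491700000001', '262011234567890', 356938035643809, 1001),
  ('491700000002', '262011234567891', 356938035643810, 1002),
  ('491700000003', '262011234567892', 356938035643811, 1003);
```

      On an ACID table, a single insert like this would be expected to produce one delta directory, such as the delta_0000001_0000001_0000 inspected in the next step.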

      ----------------------------------------- 
      3. Check whether the bloom filters are active: [ no bloom filter streams are shown for Hive managed tables ]
      ----------------------------------------- 

      [hive@c1162-node2 root]$ hive --orcfiledump hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_0000001_0000001_0000 | grep -i bloom 
      SLF4J: Class path contains multiple SLF4J bindings. 
      SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
      SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
      SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 
      Processing data file hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_0000001_0000001_0000/bucket_00000 [length: 791] 
      Structure for hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_0000001_0000001_0000/bucket_00000 

      ----------------------------------------- 
      On the other hand, for Hive external tables it works:
      ----------------------------------------- 

      CREATE external TABLE `ext_bloomTest`( 
         `msisdn` string, 
         `imsi` varchar(20), 
         `imei` bigint, 
         `cell_id` bigint) 
       ROW FORMAT SERDE 
         'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
       STORED AS INPUTFORMAT 
         'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
       OUTPUTFORMAT 
         'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
      
       TBLPROPERTIES ( 
         'bucketing_version'='2', 
         'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
         'orc.bloom.filter.fpp'='0.02');

      ----------------------------------------- 

      [hive@c1162-node2 root]$ hive --orcfiledump hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/000000_0 | grep -i bloom 
      SLF4J: Class path contains multiple SLF4J bindings. 
      SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
      SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
      SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 
      Processing data file hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/000000_0 [length: 755] 
      Structure for hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/000000_0 
          Stream: column 1 section BLOOM_FILTER_UTF8 start: 41 length 110 
          Stream: column 2 section BLOOM_FILTER_UTF8 start: 178 length 114 
          Stream: column 4 section BLOOM_FILTER_UTF8 start: 340 length 109 
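
      Note that `grep -i bloom` also matches the "Processing data file" and "Structure for" lines in both runs, because the path itself contains "bloomTest". The sketch below matches on the stream section name instead, so a count of 0 plainly means no bloom filter streams. The function name `count_bloom_streams` is hypothetical, and the `2>/dev/null` in the usage comment assumes the SLF4J messages go to stderr, as the transcripts above suggest.

```shell
# Count BLOOM_FILTER streams in `hive --orcfiledump` output read from stdin.
# grep -c prints the number of matching lines; `|| true` keeps the exit
# status at 0 when the count is 0 (grep itself exits 1 on no match).
count_bloom_streams() {
  grep -c 'section BLOOM_FILTER' || true
}

# Usage (path taken from the external-table run above):
#   hive --orcfiledump hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/000000_0 2>/dev/null | count_bloom_streams
```

      Running the same pipeline against the managed table's delta directory should make the difference between the two cases visible as a single number.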

      Attachments

        1. HIVE-21397.patch
          17 kB
          Denys Kuzmenko
        2. HIVE-21397.5.patch
          16 kB
          Denys Kuzmenko
        3. HIVE-21397.4.patch
          16 kB
          Denys Kuzmenko
        4. HIVE-21397.3.patch
          16 kB
          Denys Kuzmenko
        5. HIVE-21397.2.patch
          17 kB
          Denys Kuzmenko
        6. HIVE-21397.1.patch
          0.9 kB
          Gopal Vijayaraghavan

            People

              dkuzmenko Denys Kuzmenko
              vaibhav_hnw vaibhav