Hive / HIVE-7934

Improve column level encryption with key management



    • Type: Improvement
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved


      HIVE-6329 provides a framework for column-level encryption/decryption. However, its implementation only uses Base64 encoding, which is not secure and has the following problems:

      • With Base64WriteOnly, every user can only get the ciphertext from the client.
      • With Base64Rewriter, every user can get the plaintext from the client.
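
To illustrate why Base64 alone offers no protection, here is a minimal Python sketch. It is not code from HIVE-6329; the sample value "China" is simply borrowed from the example table in this issue. Base64 is an encoding with no key, so anyone who can read the stored value can reverse it:

```python
import base64

# Base64 is a keyless, reversible encoding, not encryption.
stored = base64.b64encode("China".encode("utf-8")).decode("ascii")

# Any user who can read the stored value can recover the plaintext,
# because decoding requires no secret.
recovered = base64.b64decode(stored).decode("utf-8")

print(stored, recovered)
```

Access control therefore has to come from something the reader may or may not possess, such as a key served by a key management service.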

      I have an improvement based on HIVE-6329 that uses key management via KMS.
      This patch implements transparent column-level encryption. Users don't need to set anything when they query tables.

      1. Set up KMS and configure kms-acls.xml (e.g. user1 and root have permission to get the key):
            <value>user1 root</value>
              ACL for get-key-version and get-current-key operations.
      2. Configure hive-site.xml.
      3. Create an encrypted table:
        drop table student_column_encrypt;
        create table student_column_encrypt (s_key INT, s_name STRING, s_country STRING, s_age INT) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
          WITH SERDEPROPERTIES ('column.encode.columns'='s_country,s_age', 'column.encode.classname'='org.apache.hadoop.hive.serde2.crypto.CryptoRewriter') 
          STORED AS TEXTFILE TBLPROPERTIES('hive.encrypt.keynames'='hive.k1');
        insert overwrite table student_column_encrypt
          select s_key, s_name, s_country, s_age
        from student;
        select * from student_column_encrypt; 
      4. Query the table as different users. Decryption is transparent: authorized users do not need to set anything. Here root has permission to get the key and user2 does not.
        [root@huang1 hive_data]# hive
        hive> select * from student_column_encrypt;       
        0	Armon	China	20
        1	Jack	USA	21
        2	Lucy	England	22
        3	Lily	France	23
        4	Yom	Spain	24
        Time taken: 0.759 seconds, Fetched: 5 row(s)
        [root@huang1 hive_data]# su user2
        [user2@huang1 hive_data]$ hive
        hive> select * from student_column_encrypt;
        0	Armon	dqyb188=	NULL
        1	Jack	YJez	NULL
        2	Lucy	cKqV1c8MTw==	NULL
        3	Lily	c7aT180H	NULL
        4	Yom	ZrST0MA=	NULL
        Time taken: 0.77 seconds, Fetched: 5 row(s)
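
The difference between the two sessions above can be modeled with a small Python sketch. It is only an illustration under stated assumptions, not the patch's code: `ACL`, `KEY`, `xor_cipher`, `encrypt`, and `read_column` are hypothetical names, the XOR keystream is a toy stand-in for a real cipher, and the NULL shown for `s_age` is assumed to arise because the stored ciphertext cannot be parsed as an INT.

```python
import base64
import hashlib

ACL = {"user1", "root"}      # hypothetical: users allowed to fetch the key
KEY = b"hive.k1-demo-key"    # hypothetical key material, for illustration only


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric XOR with a hash-derived keystream. NOT secure."""
    stream = hashlib.sha256(key).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


def encrypt(value: str) -> str:
    """Encrypt a column value and Base64-encode it for text storage."""
    return base64.b64encode(xor_cipher(value.encode("utf-8"), KEY)).decode("ascii")


def read_column(user: str, stored: str, as_int: bool = False):
    """Return plaintext if the user may get the key, else the stored ciphertext."""
    if user in ACL:
        text = xor_cipher(base64.b64decode(stored), KEY).decode("utf-8")
    else:
        text = stored
    if as_int:
        try:
            return int(text)
        except ValueError:
            return None  # ciphertext is not a valid integer, so it reads as NULL
    return text


country = encrypt("China")
age = encrypt("20")

# An authorized user sees plaintext; an unauthorized user sees ciphertext or NULL.
print(read_column("root", country), read_column("root", age, as_int=True))
print(read_column("user2", country), read_column("user2", age, as_int=True))
```

In this model the same query returns different results depending only on whether the caller is allowed to obtain the key, which is the behavior the transcript shows for root and user2.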


              Huang Xiaomeng Xiaomeng Huang