  1. Hive
  2. HIVE-5207

Support data encryption for Hive tables



    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • 0.12.0
    • None
    • None


      For sensitive and legally protected data such as personal information, it is common practice to store the data encrypted in the file system. Enabling Hive to store and query encrypted data is crucial for Hive data analysis in the enterprise.

      When creating a table, the user can specify whether it is encrypted by setting a property in TBLPROPERTIES. Once an encrypted table is created, queries on it are transparent as long as the corresponding key management facilities are set up in the environment where the query runs. We can use the Hadoop crypto support provided by HADOOP-9331 for the underlying data encryption and decryption.
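      To make "transparent" concrete, here is a minimal sketch of the round trip a reader and writer would perform on table data once the table key is available. It deliberately uses the JDK's javax.crypto streams as a stand-in, not the HADOOP-9331 API; the class name TableCryptoSketch, the AES/CTR mode, and the 128-bit key size are assumptions for illustration only.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.SecureRandom;

// Hypothetical helper: stands in for whatever codec the crypto layer provides.
public class TableCryptoSketch {
    // Assumed cipher mode; CTR needs no padding and suits streamed data.
    static final String TRANSFORMATION = "AES/CTR/NoPadding";

    // Generate a random 128-bit AES table key (data key).
    static SecretKey newTableKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Generate a random 16-byte IV; it must be stored alongside the data.
    static byte[] newIv() {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    // Write path: plaintext rows go through a cipher stream before storage.
    static byte[] encrypt(byte[] plain, SecretKey key, byte[] iv) throws Exception {
        Cipher c = Cipher.getInstance(TRANSFORMATION);
        c.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new CipherOutputStream(sink, c)) {
            out.write(plain);
        }
        return sink.toByteArray();
    }

    // Read path: stored bytes are decrypted on the fly while being read.
    static byte[] decrypt(byte[] stored, SecretKey key, byte[] iv) throws Exception {
        Cipher c = Cipher.getInstance(TRANSFORMATION);
        c.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = new CipherInputStream(new ByteArrayInputStream(stored), c)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
        }
        return plain.toByteArray();
    }
}
```

      A query engine would wrap its file input and output streams in the same way, so the query itself never sees ciphertext as long as the key can be resolved.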

      As for key management, we would support several common use cases. First, the table key (data key) can be stored in the Hive metastore, associated with the table in its properties. The table key can be explicitly specified or auto-generated, and it will be encrypted with a master key. In some cases the data being processed is generated by other applications, so we need to support externally managed or imported table keys. Also, the data generated by Hive may be consumed by other applications in the system, so we need a tool or command for exporting the table key to a Java keystore for external use.
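      The key handling described above can be sketched with standard JDK APIs: wrap the table key under a master key before it is stored, unwrap it at query time, and export it to a Java keystore for other applications. This is an illustration under stated assumptions, not the patch's implementation; the class name TableKeySketch, the AESWrap algorithm, the JCEKS keystore type, and the alias are all hypothetical choices.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.KeyStore;

// Hypothetical helper illustrating table-key wrapping and keystore export.
public class TableKeySketch {

    // Generate a random 128-bit AES key (used for both table and master keys here).
    static SecretKey newAesKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Encrypt (wrap) the table key with the master key; the result is what
    // would be stored with the table's properties in this sketch.
    static byte[] wrap(SecretKey tableKey, SecretKey masterKey) throws Exception {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.WRAP_MODE, masterKey);
        return c.wrap(tableKey);
    }

    // Recover the table key from its wrapped form using the master key.
    static SecretKey unwrap(byte[] wrapped, SecretKey masterKey) throws Exception {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.UNWRAP_MODE, masterKey);
        return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    // Export the table key into a new password-protected JCEKS keystore,
    // returned as bytes (a real tool would write them to a file).
    static byte[] exportToKeystore(SecretKey tableKey, String alias, char[] password)
            throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, null); // initialize an empty keystore
        ks.setEntry(alias, new KeyStore.SecretKeyEntry(tableKey),
                new KeyStore.PasswordProtection(password));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, password);
        return out.toByteArray();
    }

    // Import path for externally managed keys: read a key back by alias.
    static SecretKey importFromKeystore(byte[] keystoreBytes, String alias, char[] password)
            throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(new ByteArrayInputStream(keystoreBytes), password);
        return (SecretKey) ks.getKey(alias, password);
    }
}
```

      Because only the wrapped form of the table key is stored in this sketch, reading the metastore alone does not reveal the key; the master key must also be available in the query environment.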

      To handle versions of Hadoop that do not have crypto support, we can avoid compilation problems by segregating crypto API usage into separate files (shims) to be included only if a flag is defined on the Ant command line (something like -Dcrypto=true).
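      Build-time segregation is usually paired with a runtime check so that the core code never links directly against the optional classes. The sketch below shows one way that could look, loading a shim by name through reflection and falling back to a no-op when it is absent. This is an assumption about how the shim might be consumed, not something the proposal specifies; CryptoShimSketch, CryptoShim, and both implementations are hypothetical names.

```java
// Hypothetical runtime counterpart to the build-time flag: the core code
// depends only on the CryptoShim interface, never on crypto classes directly.
public class CryptoShimSketch {

    // Minimal shim contract; a real shim would expose encrypt/decrypt streams.
    interface CryptoShim {
        boolean isCryptoSupported();
    }

    // Fallback used when the crypto shim was not compiled in.
    static final class NoCryptoShim implements CryptoShim {
        public boolean isCryptoSupported() {
            return false;
        }
    }

    // Stand-in for a shim that would be built only when the flag is set.
    static final class EnabledShim implements CryptoShim {
        public boolean isCryptoSupported() {
            return true;
        }
    }

    // Load the named shim reflectively; if the class is missing or cannot be
    // linked or instantiated, fall back to the no-op shim.
    static CryptoShim load(String implClassName) {
        try {
            Class<?> cls = Class.forName(implClassName);
            return (CryptoShim) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            return new NoCryptoShim();
        }
    }
}
```

      With this arrangement, a build without the flag simply omits the shim class, and load() returns the fallback rather than failing with a missing-class error.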


        1. HIVE-5207.patch
          127 kB
          Haifeng Chen
        2. HIVE-5207.patch
          127 kB
          Haifeng Chen




              Assignee: Unassigned
              Reporter: Haifeng Chen (jerrychenhf)
              Votes: 2
              Watchers: 20



                Time Tracking

                  Original Estimate - 504h
                  Remaining Estimate - 504h
                  Time Spent - Not Specified