Hadoop Common
  1. Hadoop Common
  2. HADOOP-10150 Hadoop cryptographic file system
  3. HADOOP-10853

Refactor get instance of CryptoCodec and support create via algorithm/mode/padding.

    Details

      Description

      We should be able to create instance of CryptoCodec:

      • via codec class name. (Applications may have config for different crypto codecs)
      • via algorithm/mode/padding. (For automatically decryption, we need to find correct crypto codec and proper implementation)
      • a default crypto codec through specific config.

      This JIRA is for

      • Create instance through cipher suite(algorithm/mode/padding)
      • Refactor create instance of CryptoCodec into CryptoCodecFactory

      We need to get all crypto codecs in system, this can be done via a Java ServiceLoader + hadoop.security.crypto.codecs config.

      1. HADOOP-10853.001.patch
        39 kB
        Yi Liu
      2. HADOOP-10853.002.patch
        12 kB
        Yi Liu
      3. HADOOP-10853.003.patch
        13 kB
        Yi Liu
      4. HADOOP-10853.004.patch
        14 kB
        Yi Liu

        Activity

        Yi Liu created issue -
        Yi Liu made changes -
        Field Original Value New Value
        Fix Version/s 3.0.0 [ 12320357 ]
        Hide
        Yi Liu added a comment -

        Upload the patch.
        1. hadoop.security.crypto.codecs + a Java ServiceLoader are used to get all crypto codecs in system.
        2. hadoop.security.crypto.cipher.suite + hadoop.security.crypto.codec.class are for default crypto codec.
        3. When creating instance using algorithm/mode/padding, there may be several implementations, we should get proper implementation, default impl types are defined in hadoop.security.crypto.codec.impl.type.

        Show
        Yi Liu added a comment - Upload the patch. 1. hadoop.security.crypto.codecs + a Java ServiceLoader are used to get all crypto codecs in system. 2. hadoop.security.crypto.cipher.suite + hadoop.security.crypto.codec.class are for default crypto codec. 3. When creating instance using algorithm/mode/padding , there may be several implementations, we should get proper implementation, default impl types are defined in hadoop.security.crypto.codec.impl.type .
        Yi Liu made changes -
        Attachment HADOOP-10853.001.patch [ 12656238 ]
        Hide
        Yi Liu added a comment -

        Have some discussion with Colin Patrick McCabe offline. We decide to change the design a bit, we will configure codec classes for every cipher suite(algorithm/mode/padding), and don’t add CryptoCodecFactory. The configuration properties are:

        hadoop.security.crypto.cipher.suite
        hadoop.security.crypto.codec.class.aes.ctr.nopadding
        …
        

        In this way, user can specify algorithm/mode/padding in config value hadoop.security.crypto.cipher.suite for encryption/decryption. CryptoCodec#getInstance can also accept an algorithm/mode/padding from application.

        Show
        Yi Liu added a comment - Have some discussion with Colin Patrick McCabe offline. We decide to change the design a bit, we will configure codec classes for every cipher suite(algorithm/mode/padding), and don’t add CryptoCodecFactory . The configuration properties are: hadoop.security.crypto.cipher.suite hadoop.security.crypto.codec.class.aes.ctr.nopadding … In this way, user can specify algorithm/mode/padding in config value hadoop.security.crypto.cipher.suite for encryption/decryption. CryptoCodec#getInstance can also accept an algorithm/mode/padding from application.
        Yi Liu made changes -
        Attachment HADOOP-10853.002.patch [ 12656460 ]
        Yi Liu made changes -
        Summary Refactor create instance of CryptoCodec and add CryptoCodecFactory Refactor get instance of CryptoCodec and support create via algorithm/mode/padding.
        Yi Liu made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        Yi Liu made changes -
        Attachment HADOOP-10853.002.patch [ 12656496 ]
        Yi Liu made changes -
        Attachment HADOOP-10853.002.patch [ 12656460 ]
        Hide
        Colin Patrick McCabe added a comment -

        Thanks, this looks a lot better.

          public static CryptoCodec getInstance(Configuration conf, 
              String transformation) {
        

        Shouldn't this just take a cipher suite object? Taking a string just seems confusing since then the caller has to figure out what is expected.

        It would also be nice to standardize on calling these things "transformations" or "cipher suites" but not both. I had to look it up to make sure they were the same thing (they are, right?)

          private static String getCodecConfigName(String transformation) {
        

        Seems like this should be CipherSuite#getConfigSuffix or something. Again, nice to move away from manipulating strings, and more towards methods on objects and enums.

            try {
              random = (provider != null) ? 
                  SecureRandom.getInstance(secureRandomAlg, provider) : 
                    SecureRandom.getInstance(secureRandomAlg);
            } catch (GeneralSecurityException e) {
              LOG.warn(e.getMessage());
              random = new SecureRandom();
            }
        

        It would be nice to handle this fallback logic inside SecureRandom#getInstance. If we want a SecureRandom in any other place, we don't want to duplicate this code.

        Show
        Colin Patrick McCabe added a comment - Thanks, this looks a lot better. public static CryptoCodec getInstance(Configuration conf, String transformation) { Shouldn't this just take a cipher suite object? Taking a string just seems confusing since then the caller has to figure out what is expected. It would also be nice to standardize on calling these things "transformations" or "cipher suites" but not both. I had to look it up to make sure they were the same thing (they are, right?) private static String getCodecConfigName( String transformation) { Seems like this should be CipherSuite#getConfigSuffix or something. Again, nice to move away from manipulating strings, and more towards methods on objects and enums. try { random = (provider != null ) ? SecureRandom.getInstance(secureRandomAlg, provider) : SecureRandom.getInstance(secureRandomAlg); } catch (GeneralSecurityException e) { LOG.warn(e.getMessage()); random = new SecureRandom(); } It would be nice to handle this fallback logic inside SecureRandom#getInstance . If we want a SecureRandom in any other place, we don't want to duplicate this code.
        Hide
        Colin Patrick McCabe added a comment -

        It would be nice to handle this fallback logic inside SecureRandom#getInstance. If we want a SecureRandom in any other place, we don't want to duplicate this code.

        I talked to Yi about this offline and he pointed out to me that this is JCE (Java's) SecureRandom, not our {{Random}] subclass. So we can't add anything to it. D'oh. So ignore the preceding comment

        Show
        Colin Patrick McCabe added a comment - It would be nice to handle this fallback logic inside SecureRandom#getInstance. If we want a SecureRandom in any other place, we don't want to duplicate this code. I talked to Yi about this offline and he pointed out to me that this is JCE (Java's) SecureRandom, not our {{Random}] subclass. So we can't add anything to it. D'oh. So ignore the preceding comment
        Hide
        Yi Liu added a comment - - edited

        Thanks Colin Patrick McCabe for nice review, I update the patch for your comments.

        Shouldn't this just take a cipher suite object? Taking a string just seems confusing since then the caller has to figure out what is expected.
        It would also be nice to standardize on calling these things "transformations" or "cipher suites" but not both. I had to look it up to make sure they were the same thing (they are, right?)

        Right, they are the same thing. Now we just take cipher suite object as parameter and standardize on calling them “cipher suites”. I originally kept “transformation(it equaled cipher suite name)”, since in JCE Cipher it used transformation for algorithm/mode/padding.

        Seems like this should be CipherSuite#getConfigSuffix or something. Again, nice to move away from manipulating strings, and more towards methods on objects and enums.

        OK, update it.

        Show
        Yi Liu added a comment - - edited Thanks Colin Patrick McCabe for nice review, I update the patch for your comments. Shouldn't this just take a cipher suite object? Taking a string just seems confusing since then the caller has to figure out what is expected. It would also be nice to standardize on calling these things "transformations" or "cipher suites" but not both. I had to look it up to make sure they were the same thing (they are, right?) Right, they are the same thing. Now we just take cipher suite object as parameter and standardize on calling them “cipher suites”. I originally kept “transformation(it equaled cipher suite name)”, since in JCE Cipher it used transformation for algorithm/mode/padding. Seems like this should be CipherSuite#getConfigSuffix or something. Again, nice to move away from manipulating strings, and more towards methods on objects and enums. OK, update it.
        Yi Liu made changes -
        Attachment HADOOP-10853.003.patch [ 12656808 ]
        Hide
        Uma Maheswara Rao G added a comment - - edited

        I like the idea of suffixing the algorithm/mode/padding with codec classes config.
        Also now we are dealing with multiple claases configured. So do we need to make this configuration as HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX?

        import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_CODEC_CLASS_PREFIX;
        

        Here we just told about the current configuration with AES/CTR/NoPadding. But where do we mention that user can change the suffix to required algorithm/mode/padding to which it supports?

         Comma-separated list of crypto codec implementations for AES/CTR/NoPadding. 
        +    The first implementation will be used if avaiable, others are fallbacks.
        

        also a typo above : avaiable --> available

        Show
        Uma Maheswara Rao G added a comment - - edited I like the idea of suffixing the algorithm/mode/padding with codec classes config. Also now we are dealing with multiple claases configured. So do we need to make this configuration as HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX? import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_CODEC_CLASS_PREFIX; Here we just told about the current configuration with AES/CTR/NoPadding. But where do we mention that user can change the suffix to required algorithm/mode/padding to which it supports? Comma-separated list of crypto codec implementations for AES/CTR/NoPadding. + The first implementation will be used if avaiable, others are fallbacks. also a typo above : avaiable --> available
        Hide
        Yi Liu added a comment -

        Thanks Uma Maheswara Rao G for your review. Update the patch for your comments.

        Also now we are dealing with multiple claases configured. So do we need to make this configuration as HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX?

        Right, use CLASSES is better, even other codec classes are fallback.

        But where do we mention that user can change the suffix to required algorithm/mode/padding to which it supports?

        OK. I found an example for hdfs ha namenodes, and let’s handle it in same way in core-default.xml. Furthermore, we will add more detail in user doc of fs-encryption.

        <property>
         <name>hadoop.security.crypto.codec.classes.EXAMPLECIPHERSUITE</name>
          <value></value>
          <description>
            The prefix for a given crypto codec, contains a comma-separated
            list of implementation classes for a given crypto codec (eg EXAMPLECIPHERSUITE).
            The first implementation will be used if available, others are fallbacks.
          </description>
        </property>
        
        Show
        Yi Liu added a comment - Thanks Uma Maheswara Rao G for your review. Update the patch for your comments. Also now we are dealing with multiple claases configured. So do we need to make this configuration as HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX? Right, use CLASSES is better, even other codec classes are fallback. But where do we mention that user can change the suffix to required algorithm/mode/padding to which it supports? OK. I found an example for hdfs ha namenodes, and let’s handle it in same way in core-default.xml. Furthermore, we will add more detail in user doc of fs-encryption. <property> <name>hadoop.security.crypto.codec.classes.EXAMPLECIPHERSUITE</name> <value></value> <description> The prefix for a given crypto codec, contains a comma-separated list of implementation classes for a given crypto codec (eg EXAMPLECIPHERSUITE). The first implementation will be used if available, others are fallbacks. </description> </property>
        Yi Liu made changes -
        Attachment HADOOP-10853.004.patch [ 12657020 ]
        Hide
        Colin Patrick McCabe added a comment -

        Thanks, Yi. +1 I will commit this tomorrow if there's no more comments

        Show
        Colin Patrick McCabe added a comment - Thanks, Yi. +1 I will commit this tomorrow if there's no more comments
        Hide
        Uma Maheswara Rao G added a comment -

        +1 from me as well. Thanks a lot, Yi for addressing the feedback.

        Show
        Uma Maheswara Rao G added a comment - +1 from me as well. Thanks a lot, Yi for addressing the feedback.
        Yi Liu made changes -
        Status In Progress [ 3 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Fix Version/s fs-encryption (HADOOP-10150 and HDFS-6134) [ 12326860 ]
        Fix Version/s 3.0.0 [ 12320357 ]
        Resolution Fixed [ 1 ]
        Hide
        Yi Liu added a comment - - edited

        Thanks Colin Patrick McCabe and Uma Maheswara Rao G.
        Colin, I committed the patch to branch since there will be conflict if HDFS-6724 gets in first.

        Show
        Yi Liu added a comment - - edited Thanks Colin Patrick McCabe and Uma Maheswara Rao G . Colin, I committed the patch to branch since there will be conflict if HDFS-6724 gets in first.
        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Open Open In Progress In Progress
        23h 47m 1 Yi Liu 18/Jul/14 08:01
        In Progress In Progress Resolved Resolved
        4d 1h 38m 1 Yi Liu 22/Jul/14 09:39

          People

          • Assignee:
            Yi Liu
            Reporter:
            Yi Liu
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development