Accumulo
  1. Accumulo
  2. ACCUMULO-958

Support pluggable encryption in walogs

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: logger
    • Labels:
      None

      Description

      There are some cases where users want encryption at rest for the walogs. It should be fairly trivial to implement it in such a way to insert a CipherOutputStream into the data path (defaulting to using a NullCipher) and then making the Cipher pluggable to users can insert the appropriate mechanisms for their use case.

      This also means swapping in CipherInputStream and putting in a check to make sure the Cipher type's match at read and write time. Possibly a versioning mechanism so people can migrate Ciphers.

      1. accumulo-958-patch.diff
        193 kB
        Michael Allen
      2. Improving-Crypto-Module-Interface-v1.2.pdf
        411 kB
        Michael Allen
      3. ACCUMULO-958-actual-changes.patch
        79 kB
        Josh Elser
      4. accumulo-958.diff
        62 kB
        Michael Allen

        Issue Links

          Activity

          Hide
          Keith Turner added a comment -

          Why do this for just walogs and not files? Would the cypherstreams handle sync/flush properly?

          Show
          Keith Turner added a comment - Why do this for just walogs and not files? Would the cypherstreams handle sync/flush properly?
          Hide
          John Vines added a comment -

          The cipherstreams simply add a layer between the write and the write to DFS, so it's all server side. It should have no impact on dfs' append functionality.

          The reason this is solely for walogs, is that doing encoding in the middle of an RFile write is it could throw off the sortedness of the RFile, kill relative key encoding, and doing individual key encryption just isn't performant. For the RFiles, we're better off looking at the codec used for the block level compression of RFiles.

          Show
          John Vines added a comment - The cipherstreams simply add a layer between the write and the write to DFS, so it's all server side. It should have no impact on dfs' append functionality. The reason this is solely for walogs, is that doing encoding in the middle of an RFile write is it could throw off the sortedness of the RFile, kill relative key encoding, and doing individual key encryption just isn't performant. For the RFiles, we're better off looking at the codec used for the block level compression of RFiles.
          Hide
          Keith Turner added a comment -

          The cipherstreams simply add a layer between the write and the write to DFS, so it's all server side. It should have no impact on dfs' append functionality

          i know it will not impact dfs's append. I was wondering if encryptions streams support flushing at arbitrary points, such that if the server dies you can decrypt everything up to your last succesful flush point.

          For the RFiles, we're better off looking at the codec used for the block level compression of RFiles.

          yeah, you would want to encrypt and compress each block. I was not suggesting otherwise. I am just trying to think about this from a user perspective. They may want to just turn on encryption for accumulo. Turning on encryption for just the writeahead logs would seem like a confusing options to offer users. The user may not care how it works with rfiles and walogs, they just want to turn on encryption of Accumulo's persisted data.

          Show
          Keith Turner added a comment - The cipherstreams simply add a layer between the write and the write to DFS, so it's all server side. It should have no impact on dfs' append functionality i know it will not impact dfs's append. I was wondering if encryptions streams support flushing at arbitrary points, such that if the server dies you can decrypt everything up to your last succesful flush point. For the RFiles, we're better off looking at the codec used for the block level compression of RFiles. yeah, you would want to encrypt and compress each block. I was not suggesting otherwise. I am just trying to think about this from a user perspective. They may want to just turn on encryption for accumulo. Turning on encryption for just the writeahead logs would seem like a confusing options to offer users. The user may not care how it works with rfiles and walogs, they just want to turn on encryption of Accumulo's persisted data.
          Hide
          Adam Fuchs added a comment -

          This is really just half of a bigger story, which is half of the encryption at rest epic for Accumulo. The other small half is RFile encryption, and the other big half is key distribution. The end goal is definitely to have users be able to just turn on encryption. We're trying to make sure that all the necessary extension points are in place before 1.5 is released.

          Show
          Adam Fuchs added a comment - This is really just half of a bigger story, which is half of the encryption at rest epic for Accumulo. The other small half is RFile encryption, and the other big half is key distribution. The end goal is definitely to have users be able to just turn on encryption. We're trying to make sure that all the necessary extension points are in place before 1.5 is released.
          Hide
          Michael Allen added a comment -

          I've implemented a fairly generic, fairly pluggable encryption streaming architecture (see org.apache.accumulo.core.security.crypto for classes and interfaces). It might change as we figure out more of our requirements in this area, which is why the primary interface is already marked deprecated, even though this is the first iteration. (Is there a better Apache-style marking for that kind of thing?)

          The encryption stream right now is only attached to the WAL logs, but could be wired up to RFiles and Map files without too much more trouble.

          Submitting patch now.

          Show
          Michael Allen added a comment - I've implemented a fairly generic, fairly pluggable encryption streaming architecture (see org.apache.accumulo.core.security.crypto for classes and interfaces). It might change as we figure out more of our requirements in this area, which is why the primary interface is already marked deprecated, even though this is the first iteration. (Is there a better Apache-style marking for that kind of thing?) The encryption stream right now is only attached to the WAL logs, but could be wired up to RFiles and Map files without too much more trouble. Submitting patch now.
          Hide
          Michael Allen added a comment -

          Diff for adding WAL log encryption

          Show
          Michael Allen added a comment - Diff for adding WAL log encryption
          Hide
          John Vines added a comment -

          I checked out the patch, and it looked great. Everything ran fine. And with the change in our walogs, it's great we were able to get this into 1.5. I made a few modification to the patch, namely some formatting, adding some deprecation due to the volatile API, and I added an increased replication to the DFS secret key for some more comfort (same thing we do wtih the !METADATA table).

          It is set up with a NullCipher by default, so there should be no issues with the logger crypto being an obstruction to debugging.

          And the crypto is in the client package and no server because it should be modular enough to be utilized in the modification to the RFile we would like to see done for 1.6

          Show
          John Vines added a comment - I checked out the patch, and it looked great. Everything ran fine. And with the change in our walogs, it's great we were able to get this into 1.5. I made a few modification to the patch, namely some formatting, adding some deprecation due to the volatile API, and I added an increased replication to the DFS secret key for some more comfort (same thing we do wtih the !METADATA table). It is set up with a NullCipher by default, so there should be no issues with the logger crypto being an obstruction to debugging. And the crypto is in the client package and no server because it should be modular enough to be utilized in the modification to the RFile we would like to see done for 1.6
          Hide
          John Vines added a comment -

          Also, added Michael Allen as a contributor for this patch. Good work!

          Show
          John Vines added a comment - Also, added Michael Allen as a contributor for this patch. Good work!
          Hide
          Josh Elser added a comment -

          That was.... a fast patch application.

          Show
          Josh Elser added a comment - That was.... a fast patch application.
          Hide
          John Vines added a comment -

          I had seen an earlier diff

          Show
          John Vines added a comment - I had seen an earlier diff
          Hide
          Josh Elser added a comment -

          I meant more for the rest of us to review the changes. I attached the actual changes you committed for others to more easily review.

          Show
          Josh Elser added a comment - I meant more for the rest of us to review the changes. I attached the actual changes you committed for others to more easily review.
          Hide
          John Vines added a comment -

          I was still under the impression that a ticket owner was responsible for reviewing and applying contributor patches. Did we change our methodology for this?

          Show
          John Vines added a comment - I was still under the impression that a ticket owner was responsible for reviewing and applying contributor patches. Did we change our methodology for this?
          Hide
          John Vines added a comment -

          Christopher Tubbs pointed out that there is an experimental tag which would be more appropriate here then deprecation. It should be switched to that.

          Show
          John Vines added a comment - Christopher Tubbs pointed out that there is an experimental tag which would be more appropriate here then deprecation. It should be switched to that.
          Hide
          Josh Elser added a comment -

          For brand new features, it's my opinion that it's respectful to give more than 3 hrs to give some amount of review time if anyone so desires.

          Show
          Josh Elser added a comment - For brand new features, it's my opinion that it's respectful to give more than 3 hrs to give some amount of review time if anyone so desires.
          Hide
          John Vines added a comment -

          For future record then, how long is appropriate? Because then it's worth mentioning for any future feature freezes that patches need to be in X time earlier to be brought it.

          Show
          John Vines added a comment - For future record then, how long is appropriate? Because then it's worth mentioning for any future feature freezes that patches need to be in X time earlier to be brought it.
          Hide
          Josh Elser added a comment -

          I don't think there's a single answer to that, but it depends on the complexity of the feature. I'm being snarky because after the feature freeze was extended a week, we finally see the patch. Your point about it being in X time earlier is what I'm trying to point out.

          We still have the ability to review this even after the feature freeze happens, it's just frustrating from my point of view in generating the best 1.5.0 candidate possible (we tend to go through x.y.0 releases pretty darn quick).

          Show
          Josh Elser added a comment - I don't think there's a single answer to that, but it depends on the complexity of the feature. I'm being snarky because after the feature freeze was extended a week, we finally see the patch. Your point about it being in X time earlier is what I'm trying to point out. We still have the ability to review this even after the feature freeze happens, it's just frustrating from my point of view in generating the best 1.5.0 candidate possible (we tend to go through x.y.0 releases pretty darn quick).
          Hide
          John Vines added a comment -

          yes, but we get stuck on x.y.* for a year or so, so it does become a race to get all the features you want to see in the next year.

          Show
          John Vines added a comment - yes, but we get stuck on x.y.* for a year or so, so it does become a race to get all the features you want to see in the next year.
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk-Hadoop-2.0 #31 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/31/)
          ACCUMULO-958 - Adding Michael's patch. I did some formatting checks and added an increased default replication for the dfs secret key implementation he provided. I also did some additional code suppression due to the API provided being volatiile. (Revision 1438786)

          Result = SUCCESS
          vines :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/AccumuloConfiguration.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/ConfigSanityCheck.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModule.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModuleFactory.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModule.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModuleUtils.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultSecretKeyEncryptionStrategy.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategy.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategyContext.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/DfsLogger.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/LogSorter.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk-Hadoop-2.0 #31 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/31/ ) ACCUMULO-958 - Adding Michael's patch. I did some formatting checks and added an increased default replication for the dfs secret key implementation he provided. I also did some additional code suppression due to the API provided being volatiile. (Revision 1438786) Result = SUCCESS vines : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/AccumuloConfiguration.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/ConfigSanityCheck.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModule.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModuleFactory.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModule.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModuleUtils.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultSecretKeyEncryptionStrategy.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategy.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategyContext.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/DfsLogger.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/LogSorter.java
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk #673 (See https://builds.apache.org/job/Accumulo-Trunk/673/)
          ACCUMULO-958 - Adding Michael's patch. I did some formatting checks and added an increased default replication for the dfs secret key implementation he provided. I also did some additional code suppression due to the API provided being volatiile. (Revision 1438786)

          Result = SUCCESS
          vines :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/AccumuloConfiguration.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/ConfigSanityCheck.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModule.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModuleFactory.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModule.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModuleUtils.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultSecretKeyEncryptionStrategy.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategy.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategyContext.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/DfsLogger.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/LogSorter.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk #673 (See https://builds.apache.org/job/Accumulo-Trunk/673/ ) ACCUMULO-958 - Adding Michael's patch. I did some formatting checks and added an increased default replication for the dfs secret key implementation he provided. I also did some additional code suppression due to the API provided being volatiile. (Revision 1438786) Result = SUCCESS vines : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/AccumuloConfiguration.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/ConfigSanityCheck.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModule.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/CryptoModuleFactory.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModule.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultCryptoModuleUtils.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/DefaultSecretKeyEncryptionStrategy.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategy.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/crypto/SecretKeyEncryptionStrategyContext.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/DfsLogger.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/log/LogSorter.java
          Hide
          Josh Elser added a comment -

          A few notes:

          • Would be great to see some sort of test cases, even illustrating trivial cases
          • Javadoc on SecretKeyEncryptionStrategy would be nice
          • Move the log.debug to trace (or just remove) for things that are no outwardly useful to a user/administrator (e.g. DefaultCryptoModule:62, CryptoModuleFactory:85, DfsLogger:223, etc)
          • DefaultCryptoModule:52 comment probably isn't necessary
          • Please nuke the ALL CAPS spam in CryptoModuleFactory:[63,79,88,93,118,134,143,147], if we want the user to recognize and forcibly take action, we should propagate an Exception up instead of falling back to the Null encryption classes.
          • CryptoModuleFactory:123-131 could be replaced with:
            boolean implementsSecretKeyStrategy = SecretKeyEncryptionStrategy.class.isAssignableFrom(keyEncryptionStrategyClazz);
            
          • I won't really go deep into thoughts on CryptoModule because of your deprecation comments, but I do want to say that there's probably some easy encapsulation you can do easily with the crypto opts, params and properties.
          • No argument checking in DefaultCryptoModule (NPE easy to arise). Again, tests here would be good.
          Show
          Josh Elser added a comment - A few notes: Would be great to see some sort of test cases, even illustrating trivial cases Javadoc on SecretKeyEncryptionStrategy would be nice Move the log.debug to trace (or just remove) for things that are no outwardly useful to a user/administrator (e.g. DefaultCryptoModule:62, CryptoModuleFactory:85, DfsLogger:223, etc) DefaultCryptoModule:52 comment probably isn't necessary Please nuke the ALL CAPS spam in CryptoModuleFactory: [63,79,88,93,118,134,143,147] , if we want the user to recognize and forcibly take action, we should propagate an Exception up instead of falling back to the Null encryption classes. CryptoModuleFactory:123-131 could be replaced with: boolean implementsSecretKeyStrategy = SecretKeyEncryptionStrategy.class.isAssignableFrom(keyEncryptionStrategyClazz); I won't really go deep into thoughts on CryptoModule because of your deprecation comments, but I do want to say that there's probably some easy encapsulation you can do easily with the crypto opts, params and properties. No argument checking in DefaultCryptoModule (NPE easy to arise). Again, tests here would be good.
          Hide
          Dave Marion added a comment -

          I agree with Josh that if an exception is thrown during encryption, then we should propagate it up the stack. If a user has a requirement to encrypt the data, then we should not silently fail to encrypt the data.

          Show
          Dave Marion added a comment - I agree with Josh that if an exception is thrown during encryption, then we should propagate it up the stack. If a user has a requirement to encrypt the data, then we should not silently fail to encrypt the data.
          Hide
          Eric Newton added a comment -

          Isn't there already an extension mechanism for FileSystem? Replace the FileSystem with a new kind of FileSystem which will de/encrypt data as it is read or written.

          Show
          Eric Newton added a comment - Isn't there already an extension mechanism for FileSystem? Replace the FileSystem with a new kind of FileSystem which will de/encrypt data as it is read or written.
          Hide
          Michael Allen added a comment -

          Hi all, I'll take on getting this code up to snuff and responding to everyone's thoughtful comments. The areas I'll be concentrating on besides basic cleanups will be in the tidiness around the parameters and in getting some basic tests completed. Thanks again for the thoughts, we'll get this in better shape before release.

          Show
          Michael Allen added a comment - Hi all, I'll take on getting this code up to snuff and responding to everyone's thoughtful comments. The areas I'll be concentrating on besides basic cleanups will be in the tidiness around the parameters and in getting some basic tests completed. Thanks again for the thoughts, we'll get this in better shape before release.
          Hide
          Christopher Tubbs added a comment - - edited

          If this is going to make it in to the existing code (and stay), however polished it will be by the next release, I'd like to see it clearly marked as experimental, until it is available as a complete and coherent framework for encrypting table contents.

          So, I suggest moving the relevant classes into an "experimental" sub-package, and minimizing references to them in other code. I looked for a built-in "@Experimental" annotation, but couldn't find one, so we could create one for this sort of thing (but I still prefer the sub-package until it is no longer experimental). I do not think that they should be marked as "@Deprecated" because that implies a completely different point in the life cycle of the code (in fact, it implies the opposite end of that life cycle).

          That said, what exactly are the next actions, and the timeline for polishing this feature? From the previous comment, I gather "tests", and "tidiness" (which I interpret to mean QA refactorings, but not functional changes that incorporate critical feedback). Are there more anticipated actions that I've overlooked?

          Show
          Christopher Tubbs added a comment - - edited If this is going to make it in to the existing code (and stay), however polished it will be by the next release, I'd like to see it clearly marked as experimental, until it is available as a complete and coherent framework for encrypting table contents. So, I suggest moving the relevant classes into an "experimental" sub-package, and minimizing references to them in other code. I looked for a built-in "@Experimental" annotation, but couldn't find one, so we could create one for this sort of thing (but I still prefer the sub-package until it is no longer experimental). I do not think that they should be marked as "@Deprecated" because that implies a completely different point in the life cycle of the code (in fact, it implies the opposite end of that life cycle). That said, what exactly are the next actions, and the timeline for polishing this feature? From the previous comment, I gather "tests", and "tidiness" (which I interpret to mean QA refactorings, but not functional changes that incorporate critical feedback). Are there more anticipated actions that I've overlooked?
          Hide
          Michael Allen added a comment -

          Chris, I'm hoping to have this wrapped up by end of next week. I'll post up a list of actions soon; right now I'm working on a proposal for RFile format changes to support encryption.

          Show
          Michael Allen added a comment - Chris, I'm hoping to have this wrapped up by end of next week. I'll post up a list of actions soon; right now I'm working on a proposal for RFile format changes to support encryption.
          Hide
          Michael Allen added a comment -

          Oh, and speaking of: when you say an "experimental" package, are you thinking a subpackage within the core.security area where it rests right now, or a completely different top level org.apache.accumulo.experimental package? I don't have a particular preference.

          Show
          Michael Allen added a comment - Oh, and speaking of: when you say an "experimental" package, are you thinking a subpackage within the core.security area where it rests right now, or a completely different top level org.apache.accumulo.experimental package? I don't have a particular preference.
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk-Hadoop-2.0 #56 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/56/)
          ACCUMULO-958 marked incomplete WAL encrpytion feature as experimental so it will not show up in docs (Revision 1441504)

          Result = SUCCESS
          kturner :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk-Hadoop-2.0 #56 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/56/ ) ACCUMULO-958 marked incomplete WAL encrpytion feature as experimental so it will not show up in docs (Revision 1441504) Result = SUCCESS kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk #698 (See https://builds.apache.org/job/Accumulo-Trunk/698/)
          ACCUMULO-958 marked incomplete WAL encrpytion feature as experimental so it will not show up in docs (Revision 1441504)

          Result = SUCCESS
          kturner :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk #698 (See https://builds.apache.org/job/Accumulo-Trunk/698/ ) ACCUMULO-958 marked incomplete WAL encrpytion feature as experimental so it will not show up in docs (Revision 1441504) Result = SUCCESS kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk-Hadoop-2.0 #57 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/57/)
          ACCUMULO-958 fixed spelling, hid experimental options in shell (Revision 1441559)

          Result = SUCCESS
          kturner :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/ConfigCommand.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk-Hadoop-2.0 #57 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/57/ ) ACCUMULO-958 fixed spelling, hid experimental options in shell (Revision 1441559) Result = SUCCESS kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/ConfigCommand.java
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk #699 (See https://builds.apache.org/job/Accumulo-Trunk/699/)
          ACCUMULO-958 fixed spelling, hid experimental options in shell (Revision 1441559)

          Result = SUCCESS
          kturner :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/ConfigCommand.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk #699 (See https://builds.apache.org/job/Accumulo-Trunk/699/ ) ACCUMULO-958 fixed spelling, hid experimental options in shell (Revision 1441559) Result = SUCCESS kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/DefaultConfiguration.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/conf/Property.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/ConfigCommand.java
          Hide
          Christopher Tubbs added a comment -

          Oh, and speaking of: when you say an "experimental" package, are you thinking a subpackage within the core.security area where it rests right now, or a completely different top level org.apache.accumulo.experimental package?

          I was thinking something like o.a.a.core.security.experimental.

          Show
          Christopher Tubbs added a comment - Oh, and speaking of: when you say an "experimental" package, are you thinking a subpackage within the core.security area where it rests right now, or a completely different top level org.apache.accumulo.experimental package? I was thinking something like o.a.a.core.security.experimental.
          Hide
          Keith Turner added a comment -

          Something was done for 1.5, but its incomplete so moving this ticket to 1.6

          Show
          Keith Turner added a comment - Something was done for 1.5, but its incomplete so moving this ticket to 1.6
          Hide
          Michael Allen added a comment -

          Document describing how I improved the CryptoModule interface.

          Show
          Michael Allen added a comment - Document describing how I improved the CryptoModule interface.
          Hide
          Michael Allen added a comment -

          Diff vs. latest 1.5 branch

          Show
          Michael Allen added a comment - Diff vs. latest 1.5 branch
          Hide
          Michael Allen added a comment -

          Hi everyone. I have submitted two new files (attached) in order to address the comments previously raised. In going over the various classes and interfaces, I found some siginifcant simplifications to make that lead to cleaner code overall. In the attached PDF, I explain my reasoning for sticking with the current CryptoModule approach and not using the compression codec subsystem. The attached diff has a working implementation of the proposed changes, including unit tests against both the low level crypto code and the RFile integration.

          Thoughts are welcomed.

          Show
          Michael Allen added a comment - Hi everyone. I have submitted two new files (attached) in order to address the comments previously raised. In going over the various classes and interfaces, I found some siginifcant simplifications to make that lead to cleaner code overall. In the attached PDF, I explain my reasoning for sticking with the current CryptoModule approach and not using the compression codec subsystem. The attached diff has a working implementation of the proposed changes, including unit tests against both the low level crypto code and the RFile integration. Thoughts are welcomed.
          Hide
          Keith Turner added a comment -

          Since the interface has changed in this new patch, should we revert the first patch for 1.5?

          Show
          Keith Turner added a comment - Since the interface has changed in this new patch, should we revert the first patch for 1.5?
          Hide
          Michael Allen added a comment -

          Since I doubt anyone would use the feature in that state (I don't think we had even made it into the RFiles at that point), I don't see much harm in reverting it. My sincere hope is that we can find a way to get these improvements in soon, though, in whatever vehicle makes sense.

          Show
          Michael Allen added a comment - Since I doubt anyone would use the feature in that state (I don't think we had even made it into the RFiles at that point), I don't see much harm in reverting it. My sincere hope is that we can find a way to get these improvements in soon, though, in whatever vehicle makes sense.
          Hide
          John Vines added a comment -

          I think that makes sense as well

          Show
          John Vines added a comment - I think that makes sense as well
          Hide
          Keith Turner added a comment -

          I will take a look at reverting in 1.5. We can work on getting this in for 1.6. I want to give the new patch a thorough review, I will do that in the next few days. I took a quick glance at and I think I noticed a few chunks of commented out code being added. I would recommend dropping those.

          Show
          Keith Turner added a comment - I will take a look at reverting in 1.5. We can work on getting this in for 1.6. I want to give the new patch a thorough review, I will do that in the next few days. I took a quick glance at and I think I noticed a few chunks of commented out code being added. I would recommend dropping those.
          Hide
          Michael Allen added a comment -

          Feh, I thought I caught all those, I will take a peek.

          Show
          Michael Allen added a comment - Feh, I thought I caught all those, I will take a peek.
          Hide
          Billie Rinaldi added a comment -

          Michael, if the patch available for HADOOP-9333 is accepted, would that change your approach (putting aside the question of which Hadoop versions will have it)?

          Show
          Billie Rinaldi added a comment - Michael, if the patch available for HADOOP-9333 is accepted, would that change your approach (putting aside the question of which Hadoop versions will have it)?
          Hide
          Keith Turner added a comment -

          Interesting find Billie. Since HADOOP-9333 is in the development stage, we have a chance to influence it if it does not meet our needs for some reason.

          Show
          Keith Turner added a comment - Interesting find Billie. Since HADOOP-9333 is in the development stage, we have a chance to influence it if it does not meet our needs for some reason.
          Hide
          Michael Allen added a comment -

          There's a lot to like about the approach in HADOOP-9333. They did run into the same set of issues I outlined in terms of the codec framework needing additional parameterization that it did not have. Their design doc is here, BTW, for easy retrieval and browsing: https://issues.apache.org/jira/secure/attachment/12571116/Hadoop%20Crypto%20Design.pdf

          Their implementation is fairly directly tied to AES and RC4 as crypto algorithms, which aren't necessarily bad choices, but if someone wants to switch out one crypto provider for another, the way they approach it would require more code changes (I believe) rather than just configuration changes. I think both approaches (theirs and mine) require code changes to the points wanting to employ the encryption, so I think that's pretty much a wash.

          Their solution does probe deeper into how one would be encrypting M/R output than my current patch does, but of course my current patch is more Accumulo and less Hadoop focused.

          I don't know if I would toss out my patch entirely in favor of theirs, especially given that the distribution of it probably won't catch up to all the Hadoop installations out there for quite some time.

          Show
          Michael Allen added a comment - There's a lot to like about the approach in HADOOP-9333 . They did run into the same set of issues I outlined in terms of the codec framework needing additional parameterization that it did not have. Their design doc is here, BTW, for easy retrieval and browsing: https://issues.apache.org/jira/secure/attachment/12571116/Hadoop%20Crypto%20Design.pdf Their implementation is fairly directly tied to AES and RC4 as crypto algorithms, which aren't necessarily bad choices, but if someone wants to switch out one crypto provider for another, the way they approach it would require more code changes (I believe) rather than just configuration changes. I think both approaches (theirs and mine) require code changes to the points wanting to employ the encryption, so I think that's pretty much a wash. Their solution does probe deeper into how one would be encrypting M/R output than my current patch does, but of course my current patch is more Accumulo and less Hadoop focused. I don't know if I would toss out my patch entirely in favor of theirs, especially given that the distribution of it probably won't catch up to all the Hadoop installations out there for quite some time.
          Hide
          Billie Rinaldi added a comment -

          Are you doing encryption before or after compression? It seems like that could be important. Presumably the data will compress better before encryption.

          Show
          Billie Rinaldi added a comment - Are you doing encryption before or after compression? It seems like that could be important. Presumably the data will compress better before encryption.
          Hide
          Michael Allen added a comment -

          The encryption happens after the compression in my current patch. And yeah, encrypted data should compress very very badly, by design.

          Show
          Michael Allen added a comment - The encryption happens after the compression in my current patch. And yeah, encrypted data should compress very very badly, by design.
          Hide
          John Vines added a comment -

          Resolved with patch under ACCUMULO-998

          Show
          John Vines added a comment - Resolved with patch under ACCUMULO-998

            People

            • Assignee:
              Michael Allen
              Reporter:
              John Vines
            • Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development