Details

    • Type: New Feature New Feature
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0
    • Component/s: security
    • Labels:
      None

      Description

      Saw an interesting article: http://www.fiercegovernmentit.com/story/sasc-accumulo-language-pro-open-source-say-proponents/2012-06-14

      "The Senate Armed Services Committee version of the fiscal 2013 national defense authorization act (S. 3254) would require DoD agencies to foreswear the Accumulo NoSQL database after Sept. 30, 2013, unless the DoD CIO certifies that there exists either no viable commercial open source database with security features comparable to [Accumulo] (such as the HBase or Cassandra databases)..."

      Not sure what a 'commercial open source database' is, and I'm not sure whats going on in the article, but tra-la-la'ing, if we had per-KeyValue 'security' like Accumulo's, we might put ourselves in the running for federal contributions?

      1. 6222.pdf
        1.30 MB
        Andrew Purtell
      2. cell-acls-kv-tags-not-for-review.zip
        27 kB
        Andrew Purtell
      3. HBaseCellRow-LevelSecurityPRD.docx
        31 kB
        Himanshu Vashishtha
      4. HBaseCellRow-LevelSecurityDesignDoc.docx
        291 kB
        Himanshu Vashishtha

        Issue Links

          Activity

          Hide
          Andrew Purtell added a comment -

          It's rather shocking to see congressional level discussion about NoSQL winners and losers.

          Show
          Andrew Purtell added a comment - It's rather shocking to see congressional level discussion about NoSQL winners and losers.
          Hide
          Lars George added a comment -

          I was thinking about this as well a while back when Accumulo came up. In that context I thought it might be good to have the option to add "tags" to each KV. This can be used for security or other purposes, like replication or transactional support. Another use case is filtering as these tags could have special features like caching or indexes.

          Show
          Lars George added a comment - I was thinking about this as well a while back when Accumulo came up. In that context I thought it might be good to have the option to add "tags" to each KV. This can be used for security or other purposes, like replication or transactional support. Another use case is filtering as these tags could have special features like caching or indexes.
          Hide
          Andrew Purtell added a comment -

          HBASE-4477 could transmit tags in the WAL.

          HBASE-2893 Is one way to manage them in the store.

          Revisit those issues?

          Others? New approaches?

          Show
          Andrew Purtell added a comment - HBASE-4477 could transmit tags in the WAL. HBASE-2893 Is one way to manage them in the store. Revisit those issues? Others? New approaches?
          Hide
          Jonathan Hsieh added a comment -

          Here's a good link with high level semantics of their implementation.

          http://accumulo.apache.org/1.4/user_manual/Security.html

          I've spent a few hours looking at the accumulo code. Internally it is:

          • an extra visibility/tags field in their cell Key which supports AND, OR, NOT boolean operators.
          • an always on visibility filter on the tablet/regionserver side.
          • a constraint policy option that determines behavior if a users attempts to write to labels they do not have read access to:
            • allow
            • throw security exceptions
          • authentication credentials and authorization plumbing
          Show
          Jonathan Hsieh added a comment - Here's a good link with high level semantics of their implementation. http://accumulo.apache.org/1.4/user_manual/Security.html I've spent a few hours looking at the accumulo code. Internally it is: an extra visibility/tags field in their cell Key which supports AND, OR, NOT boolean operators. an always on visibility filter on the tablet/regionserver side. a constraint policy option that determines behavior if a users attempts to write to labels they do not have read access to: allow throw security exceptions authentication credentials and authorization plumbing
          Hide
          Andrew Purtell added a comment -

          @Jon we could do it that way but then the changes are core . A big concern when This first came up is the impact increasing the size of every KV would have for something that won't be the common case.

          Show
          Andrew Purtell added a comment - @Jon we could do it that way but then the changes are core . A big concern when This first came up is the impact increasing the size of every KV would have for something that won't be the common case.
          Hide
          Jimmy Xiang added a comment -

          We can put some attribute on the table level to control if the feature is enabled for a table. So that in normal case, the KV is not affected.
          We can also put such information in a system table.

          Show
          Jimmy Xiang added a comment - We can put some attribute on the table level to control if the feature is enabled for a table. So that in normal case, the KV is not affected. We can also put such information in a system table.
          Hide
          Andrew Purtell added a comment -

          @Jimmy Such a proposal must address the impact on code. So is that a subclass of KV? Or a bunch of conditional code wherever KVs are handled?

          Show
          Andrew Purtell added a comment - @Jimmy Such a proposal must address the impact on code. So is that a subclass of KV? Or a bunch of conditional code wherever KVs are handled?
          Hide
          Jonathan Hsieh added a comment - - edited

          @Andy I'm not proposing we necessarily do it the way accumulo does it – the more important point is that we mimic it's semantics to make it easier for users to eventually port to HBase.

          The meta column idea you describe in HBASE-2893 and its use cases seems to also argue for restoring the distinction between locality groups and column families from original bigtable paper. (these are conflated in HBase).

          Using a meta column for cell-granularity visibility seems like a reasonable hbase implementation option. I'm not convinced why this approach would save significantly more space. I do think the not changing core argument is more compelling hbase-specific constraint.

          The accumulo implementation has a default case that would make it cost the tags cost extra only on cells with non-default behavior (e.g. only the highly-sensitive or cols with sensitivity varying cells). They delta encode their keys (including the visiblity tags) like we do in 0.94.

          As a straw man, a core-changing HBase version that modified KV's wouldn't necessarily incur more space – we could use a new KeyValue.Type (TaggedPut?) that had visibility settings, and use the old Put type that used the default tags.

          Show
          Jonathan Hsieh added a comment - - edited @Andy I'm not proposing we necessarily do it the way accumulo does it – the more important point is that we mimic it's semantics to make it easier for users to eventually port to HBase. The meta column idea you describe in HBASE-2893 and its use cases seems to also argue for restoring the distinction between locality groups and column families from original bigtable paper. (these are conflated in HBase). Using a meta column for cell-granularity visibility seems like a reasonable hbase implementation option. I'm not convinced why this approach would save significantly more space. I do think the not changing core argument is more compelling hbase-specific constraint. The accumulo implementation has a default case that would make it cost the tags cost extra only on cells with non-default behavior (e.g. only the highly-sensitive or cols with sensitivity varying cells). They delta encode their keys (including the visiblity tags) like we do in 0.94. As a straw man, a core-changing HBase version that modified KV's wouldn't necessarily incur more space – we could use a new KeyValue.Type (TaggedPut?) that had visibility settings, and use the old Put type that used the default tags.
          Hide
          Lars George added a comment -

          I am thinking of moving the tags between the key and the value of the KV. The existence would need to be tracked somehow, so we could add a single byte as a bit field to flag the presence. Or add a new type value like PUT_WITH_TAGS or some such. That would leave existing KVs the same and only the code needs to be enhanced to enable the actual security layer - which would be a system level coprocessor I presume.

          Show
          Lars George added a comment - I am thinking of moving the tags between the key and the value of the KV. The existence would need to be tracked somehow, so we could add a single byte as a bit field to flag the presence. Or add a new type value like PUT_WITH_TAGS or some such. That would leave existing KVs the same and only the code needs to be enhanced to enable the actual security layer - which would be a system level coprocessor I presume.
          Hide
          Lars George added a comment -

          Oh, and that approach would also leave all the comparators etc. untouched. That sounds like a good thing.

          Show
          Lars George added a comment - Oh, and that approach would also leave all the comparators etc. untouched. That sounds like a good thing.
          Hide
          Andrew Purtell added a comment - - edited

          Thus far we have various hooks and a proposal for an out of band approach: HBASE-2893, HBASE-4477, API object attributes (HBASE-3811, HBASE-3921, HBASE-3931, etc.). The impetus (as I understand it) for an out of band approach is the special nature of KV and the desire to avoid adding additional conditionals/complexity in code wherever it touches KV. That is balanced by the problems an out of band approach causes for minimizing overheads when tags are present, as Jon mentions, without locality groups (a significant change) there is extra IO. I see this as a more conservative approach that won't perturb performance sensitive members of our community, because there is zero overhead unless configuration is toggled on, coprocessors are loaded, and such.

          However I am not arguing for or against any particular approach. The suggestions by Jon, Lars, and Jimmy are the start of a design proposal but should be pulled together into an actual proposal (as another JIRA?). This is my point. Adding even only one byte to every KV has a very significant effect on storage and IO when the data set is large. Adding a new type avoids that but what is the impact in the code? And it wouldn't be one new type right, there would be a tagged analogue for every mutator? A convincing exploration of these issues would go a long way here IMHO.

          Show
          Andrew Purtell added a comment - - edited Thus far we have various hooks and a proposal for an out of band approach: HBASE-2893 , HBASE-4477 , API object attributes ( HBASE-3811 , HBASE-3921 , HBASE-3931 , etc.). The impetus (as I understand it) for an out of band approach is the special nature of KV and the desire to avoid adding additional conditionals/complexity in code wherever it touches KV. That is balanced by the problems an out of band approach causes for minimizing overheads when tags are present, as Jon mentions, without locality groups (a significant change) there is extra IO. I see this as a more conservative approach that won't perturb performance sensitive members of our community, because there is zero overhead unless configuration is toggled on, coprocessors are loaded, and such. However I am not arguing for or against any particular approach. The suggestions by Jon, Lars, and Jimmy are the start of a design proposal but should be pulled together into an actual proposal (as another JIRA?). This is my point. Adding even only one byte to every KV has a very significant effect on storage and IO when the data set is large. Adding a new type avoids that but what is the impact in the code? And it wouldn't be one new type right, there would be a tagged analogue for every mutator? A convincing exploration of these issues would go a long way here IMHO.
          Hide
          Lars George added a comment -

          I agree with the need of discussing what the objective is and how to get there with the least amount of impact on existing code. That is why I like the idea of reusing the type field.

          I would only add one more type (the one mentioned) and that, together with a global switch, would trigger the code to parse the tags located after the key but before the value (or maybe after the value as it is length prefixed). That tags once parsed can be access then by the coprocessor to do the filtering etc.

          So we would need a way to attach tags to a KV as well as access them once they are parsed out. Seems reasonable?

          Show
          Lars George added a comment - I agree with the need of discussing what the objective is and how to get there with the least amount of impact on existing code. That is why I like the idea of reusing the type field. I would only add one more type (the one mentioned) and that, together with a global switch, would trigger the code to parse the tags located after the key but before the value (or maybe after the value as it is length prefixed). That tags once parsed can be access then by the coprocessor to do the filtering etc. So we would need a way to attach tags to a KV as well as access them once they are parsed out. Seems reasonable?
          Hide
          Matt Corgan added a comment -

          DataBlockEncoding gives us the opportunity to add features to a KeyValue interface but create different implementations that have the option of ignoring some of the features. For HBASE-4676 (PrefixTrie), I need to create an interface so that I can store the different parts of a KeyValue in different backing arrays: https://github.com/hotpads/hbase-prefix-trie/blob/master/src/org/apache/hadoop/hbase/cell/HCell.java

          This discussion reminds me of the memstoreTS in the current KeyValue which is more of an afterthought and has negative performance implications. With DataBlockEncoding I'm able to treat it as a first class citizen when it is needed and encode it away when it's not needed. I suppose there is a small cpu cost, but really small relative to what else is happening.

          In the future, I think we're going to want to treat the legacy KeyValue and legacy LinkedList<KeyValue> style data blocks as just another DataBlockEncoding (DataBlockEncoding.NONE), and instead do most operations through a KeyValue interface. After a lot of benchmarking and research on compiler optimizations, I'm pretty sure we are far away from any performance issues with an interface if people are worried about that... i can comment more on that later if anyone wants.

          Show
          Matt Corgan added a comment - DataBlockEncoding gives us the opportunity to add features to a KeyValue interface but create different implementations that have the option of ignoring some of the features. For HBASE-4676 (PrefixTrie), I need to create an interface so that I can store the different parts of a KeyValue in different backing arrays: https://github.com/hotpads/hbase-prefix-trie/blob/master/src/org/apache/hadoop/hbase/cell/HCell.java This discussion reminds me of the memstoreTS in the current KeyValue which is more of an afterthought and has negative performance implications. With DataBlockEncoding I'm able to treat it as a first class citizen when it is needed and encode it away when it's not needed. I suppose there is a small cpu cost, but really small relative to what else is happening. In the future, I think we're going to want to treat the legacy KeyValue and legacy LinkedList<KeyValue> style data blocks as just another DataBlockEncoding (DataBlockEncoding.NONE), and instead do most operations through a KeyValue interface. After a lot of benchmarking and research on compiler optimizations, I'm pretty sure we are far away from any performance issues with an interface if people are worried about that... i can comment more on that later if anyone wants.
          Hide
          Jonathan Hsieh added a comment -

          Do we think of the goal of faithfully mimicking accumulo's is reasonable? (my first impression is yes). If so, I think we'd need to define those semantics and to do that we'd need bring up an accumulo cluster (or spend more time in the code) to see what its semantics are.

          Here's the accumulo api. (ColumnVisibility is tag expression)
          http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/data/Mutation.html

          You can see that delete mutations in accumulo also have visibility expressions – and I'm not actually sure what the semantics are. Let's say you have a "public" put on row A, and then a "sensitive" delete of the entire row A. Does someone with the only the "public" authorizations see the old row A, or no row A? If it was no row A, would they be able to do a versioned query and get the old row A (and could then infer out that there was a sensitive delete?)...

          Also, I didn't see atomic increment operator in accumulo. If it doesn't exist we'd have to define that behavior. Some options:
          1) No visibility settings on hbase increment mutation operations, thus only keep only one value always readable by all
          2) Add some mechanism to set counter visibility so we could to require authorization to read.
          3) Keep one counter for each visibility tag on a particular cell... (yuck).

          Show
          Jonathan Hsieh added a comment - Do we think of the goal of faithfully mimicking accumulo's is reasonable? (my first impression is yes). If so, I think we'd need to define those semantics and to do that we'd need bring up an accumulo cluster (or spend more time in the code) to see what its semantics are. Here's the accumulo api. (ColumnVisibility is tag expression) http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/data/Mutation.html You can see that delete mutations in accumulo also have visibility expressions – and I'm not actually sure what the semantics are. Let's say you have a "public" put on row A, and then a "sensitive" delete of the entire row A. Does someone with the only the "public" authorizations see the old row A, or no row A? If it was no row A, would they be able to do a versioned query and get the old row A (and could then infer out that there was a sensitive delete?)... Also, I didn't see atomic increment operator in accumulo. If it doesn't exist we'd have to define that behavior. Some options: 1) No visibility settings on hbase increment mutation operations, thus only keep only one value always readable by all 2) Add some mechanism to set counter visibility so we could to require authorization to read. 3) Keep one counter for each visibility tag on a particular cell... (yuck).
          Hide
          Andrew Purtell added a comment -

          @Matt So attributes carry in information and we use a new encoding type to store tags adjacent to KVs if they are labelled?

          @Jon For counters definitely not 1, option 2 seems better.

          Show
          Andrew Purtell added a comment - @Matt So attributes carry in information and we use a new encoding type to store tags adjacent to KVs if they are labelled? @Jon For counters definitely not 1, option 2 seems better.
          Hide
          stack added a comment -

          @Lars When you say "...I thought it might be good to have the option to add "tags" to each KV", what would a tag look like? A bit set? Or some name/values within the KV?

          I like Jon's suggestion that HBASE-2893 could be less onerous if we implemented Locality groups (the meta-cf would be co-mingled w/ the data its meta for).

          They delta encode their keys (including the visiblity tags) like we do in 0.94.

          Should we turn the above on as default?

          As a straw man, a core-changing HBase version that modified KV's wouldn't necessarily incur more space – we could use a new KeyValue.Type (TaggedPut?) that had visibility settings, and use the old Put type that used the default tags.

          KVs are also versioned so we should be able to up the version and add new in-key facility (A while back I messed w/ doing this adding a sequenceid beside the timestamp... I did it as "conditional code" rather than as subclass which for sure compounded the complexity of the already-complex KeyValue – unfortunately, hopefully there is a better route – but it seemed possible making different KV types float in same scanner merge....). If we add a ColumnVisibility-like expression into the KV, we'd have to update Comparators to exclude this portion from inclusion in sort.

          On making KV an Interface, that'd be cool. Todd had a go at it a while back but the ripple turned into an avalanche of code changes IIRC so he suggested we do it piecemeal. It'd be sweet though.

          +1 on Andrew's proposal first. We need a driver?

          On faithful mimicking of Accumulo's implementation, that makes sense in the absence of an actual customer who needs this stuff (Jon? You have one?)? Maybe the Accumulo implementation is far from what is wanted – these 'expression's are passed in the clear – and a superior implementation might not be that much of a stretch?

          Show
          stack added a comment - @Lars When you say "...I thought it might be good to have the option to add "tags" to each KV", what would a tag look like? A bit set? Or some name/values within the KV? I like Jon's suggestion that HBASE-2893 could be less onerous if we implemented Locality groups (the meta-cf would be co-mingled w/ the data its meta for). They delta encode their keys (including the visiblity tags) like we do in 0.94. Should we turn the above on as default? As a straw man, a core-changing HBase version that modified KV's wouldn't necessarily incur more space – we could use a new KeyValue.Type (TaggedPut?) that had visibility settings, and use the old Put type that used the default tags. KVs are also versioned so we should be able to up the version and add new in-key facility (A while back I messed w/ doing this adding a sequenceid beside the timestamp... I did it as "conditional code" rather than as subclass which for sure compounded the complexity of the already-complex KeyValue – unfortunately, hopefully there is a better route – but it seemed possible making different KV types float in same scanner merge....). If we add a ColumnVisibility-like expression into the KV, we'd have to update Comparators to exclude this portion from inclusion in sort. On making KV an Interface, that'd be cool. Todd had a go at it a while back but the ripple turned into an avalanche of code changes IIRC so he suggested we do it piecemeal. It'd be sweet though. +1 on Andrew's proposal first. We need a driver? On faithful mimicking of Accumulo's implementation, that makes sense in the absence of an actual customer who needs this stuff (Jon? You have one?)? Maybe the Accumulo implementation is far from what is wanted – these 'expression's are passed in the clear – and a superior implementation might not be that much of a stretch?
          Hide
          Lars George added a comment -

          @Lars When you say "...I thought it might be good to have the option to add "tags" to each KV", what would a tag look like? A bit set? Or some name/values within the KV?

          I would assume the latter, a map of key values for each tag.

          Please note that I would like for these tags NOT just being used for security, but also as something we can use in a broader sense. Like the per KV memstoreTS idea. They can be optionally attached to the KVs when needed. The approach does not change the existing KVs nor any comparator code. Maybe the above is not a good example as that seems something that is eventually needed for all KVs and would warrant for a more tighten terrain into the key and comparators. But I am hoping you get the idea?

          So far the other approaches (if discussed on that level yet at all) are trying to store the security in collateral places, which I think is not good enough to be compared with Accumulo. But I might be wrong. I personally think that having the ACLs outside the KV asks for situations where the info is lost. I think the data is so important, it should be part of the KV itself. Just thinking out loud.

          Show
          Lars George added a comment - @Lars When you say "...I thought it might be good to have the option to add "tags" to each KV", what would a tag look like? A bit set? Or some name/values within the KV? I would assume the latter, a map of key values for each tag. Please note that I would like for these tags NOT just being used for security, but also as something we can use in a broader sense. Like the per KV memstoreTS idea. They can be optionally attached to the KVs when needed. The approach does not change the existing KVs nor any comparator code. Maybe the above is not a good example as that seems something that is eventually needed for all KVs and would warrant for a more tighten terrain into the key and comparators. But I am hoping you get the idea? So far the other approaches (if discussed on that level yet at all) are trying to store the security in collateral places, which I think is not good enough to be compared with Accumulo. But I might be wrong. I personally think that having the ACLs outside the KV asks for situations where the info is lost. I think the data is so important, it should be part of the KV itself. Just thinking out loud.
          Hide
          Lars George added a comment -

          If we add a ColumnVisibility-like expression into the KV, we'd have to update Comparators to exclude this portion from inclusion in sort.

          That is the whole point of my proposal up top, using the new type and attaching the tag bag after the key, the default code path stays untouched.

          Sure this needs some extra conditional statements, but is easy to test for. The rest of the tests need no overhaul. All the other approaches strike me as overkill - although they might be worthwhile in their own right. So the question is, what weapon do we choose, so that we solve one problem and hopefully get the most collateral benefit?

          Converting KVs into an interface and subclassing them, as you say, seems as well as a major overhaul of the code base. Do we really want this right now? And what does it give us besides the basis for the problem at hand?

          Show
          Lars George added a comment - If we add a ColumnVisibility-like expression into the KV, we'd have to update Comparators to exclude this portion from inclusion in sort. That is the whole point of my proposal up top, using the new type and attaching the tag bag after the key, the default code path stays untouched. Sure this needs some extra conditional statements, but is easy to test for. The rest of the tests need no overhaul. All the other approaches strike me as overkill - although they might be worthwhile in their own right. So the question is, what weapon do we choose, so that we solve one problem and hopefully get the most collateral benefit? Converting KVs into an interface and subclassing them, as you say, seems as well as a major overhaul of the code base. Do we really want this right now? And what does it give us besides the basis for the problem at hand?
          Hide
          Lars George added a comment -

          @Matt If you need to add this anyways, then I would prefer to do this for your problem as well as this one. This kills two issues with one stone. It would also address the future extensibility that adding the KV interface was trying to address anyways.

          Have you had a chance to go through Todd's attempt and see if you were able to resolve all of the concerns there? As Stack says, solve the "ripple effect"?

          I like this (or my original proposal) best compared to adding meta columns, which seem to also call for a fundamental rearchitecture to add locality groups proper. Feel free to convince me otherwise

          Show
          Lars George added a comment - @Matt If you need to add this anyways, then I would prefer to do this for your problem as well as this one. This kills two issues with one stone. It would also address the future extensibility that adding the KV interface was trying to address anyways. Have you had a chance to go through Todd's attempt and see if you were able to resolve all of the concerns there? As Stack says, solve the "ripple effect"? I like this (or my original proposal) best compared to adding meta columns, which seem to also call for a fundamental rearchitecture to add locality groups proper. Feel free to convince me otherwise
          Hide
          Matt Corgan added a comment -

          @Matt So attributes carry in information and we use a new encoding type to store tags adjacent to KVs if they are labelled?

          Yeah, basically you would be required to use the new encoding format (TaggedPrefixTrie, for example), and if you switched back to DataBlockEncoding.NONE, your tags would be erased (we'd want to make that scenario appropriately difficult of course.). The trie would de-duplicate them, take up no space if they aren't there, and use >=1 byte per cell index if they are there.

          Have you had a chance to go through Todd's attempt and see if you were able to resolve all of the concerns there?

          I was aware Todd tried this at some point but haven't looked into the details or talked to him about it. I assume he was attempting a full "extract interface" style refactoring, which I think is probably too big of a change. My approach is a little different and sets in place a strategy for gradually migrating to the interface. From the top of the HCell class linked to above, you can see these methods that require any implementation to be able to create legacy KeyValues.

          //V1 is the byte[] format of the current KeyValue implementation.
          //for building a result to send to client
          int getV1KeyLength();
          int getV1Length();
          byte[] getV1KeyValueArray();
          KeyValue getKeyValue();
          ByteBuffer getV1KeyBuffer();
          
          int appendKeyToV1ByteArray(byte[] output, int offset);
          int appendToV1ByteArray(byte[] output, int offset);
          

          This lets the interface start in the data block encoders and slowly grow out into the storefiles, filters, etc, and eventually out to the client. Eventually we deprecate those methods from the interface and try to finalize the transition. One problem with this approach with regard to the tags is that only the storage layer would know about them at first, and it would take some time to expose them to the client.

          All-in-all, the tags raise some interesting questions in my mind like "why isn't everything in the KV implemented as a tag?" For example, i'd love to see per-cell TTL's someday. Maybe tags go hand-in-hand with co-processors to implement features like that?

          Show
          Matt Corgan added a comment - @Matt So attributes carry in information and we use a new encoding type to store tags adjacent to KVs if they are labelled? Yeah, basically you would be required to use the new encoding format (TaggedPrefixTrie, for example), and if you switched back to DataBlockEncoding.NONE, your tags would be erased (we'd want to make that scenario appropriately difficult of course.). The trie would de-duplicate them, take up no space if they aren't there, and use >=1 byte per cell index if they are there. Have you had a chance to go through Todd's attempt and see if you were able to resolve all of the concerns there? I was aware Todd tried this at some point but haven't looked into the details or talked to him about it. I assume he was attempting a full "extract interface" style refactoring, which I think is probably too big of a change. My approach is a little different and sets in place a strategy for gradually migrating to the interface. From the top of the HCell class linked to above, you can see these methods that require any implementation to be able to create legacy KeyValues. //V1 is the byte [] format of the current KeyValue implementation. // for building a result to send to client int getV1KeyLength(); int getV1Length(); byte [] getV1KeyValueArray(); KeyValue getKeyValue(); ByteBuffer getV1KeyBuffer(); int appendKeyToV1ByteArray( byte [] output, int offset); int appendToV1ByteArray( byte [] output, int offset); This lets the interface start in the data block encoders and slowly grow out into the storefiles, filters, etc, and eventually out to the client. Eventually we deprecate those methods from the interface and try to finalize the transition. One problem with this approach with regard to the tags is that only the storage layer would know about them at first, and it would take some time to expose them to the client. All-in-all, the tags raise some interesting questions in my mind like "why isn't everything in the KV implemented as a tag?" For example, i'd love to see per-cell TTL's someday. Maybe tags go hand-in-hand with co-processors to implement features like that?
          Hide
          Andrew Purtell added a comment -

          @Lars

          I was following up to this point:

          I like this (or my original proposal) best compared to adding meta columns, which seem to also call for a fundamental rearchitecture to add locality groups proper.

          That's not essential to storing labels in a metacolumn, though it may be advisable for performance reasons.

          Show
          Andrew Purtell added a comment - @Lars I was following up to this point: I like this (or my original proposal) best compared to adding meta columns, which seem to also call for a fundamental rearchitecture to add locality groups proper. That's not essential to storing labels in a metacolumn, though it may be advisable for performance reasons.
          Hide
          Andrew Purtell added a comment -

          @Matt

          @Matt So attributes carry in information and we use a new encoding type to store tags adjacent to KVs if they are labelled?

          Yeah, basically you would be required to use the new encoding format (TaggedPrefixTrie, for example), and if you switched back to DataBlockEncoding.NONE, your tags would be erased (we'd want to make that scenario appropriately difficult of course.). The trie would de-duplicate them, take up no space if they aren't there, and use >=1 byte per cell index if they are there.

          So we may have this as a way to store tags inline with data, with dedup/optimize away if not needed; and we may have Lars' somehow tag structure addition to KV (Lars: what would that look like?). Worth doing a bake-off?

          Show
          Andrew Purtell added a comment - @Matt @Matt So attributes carry in information and we use a new encoding type to store tags adjacent to KVs if they are labelled? Yeah, basically you would be required to use the new encoding format (TaggedPrefixTrie, for example), and if you switched back to DataBlockEncoding.NONE, your tags would be erased (we'd want to make that scenario appropriately difficult of course.). The trie would de-duplicate them, take up no space if they aren't there, and use >=1 byte per cell index if they are there. So we may have this as a way to store tags inline with data, with dedup/optimize away if not needed; and we may have Lars' somehow tag structure addition to KV (Lars: what would that look like?). Worth doing a bake-off?
          Hide
          Andrew Purtell added a comment -

          All-in-all, the tags raise some interesting questions in my mind like "why isn't everything in the KV implemented as a tag?" For example, i'd love to see per-cell TTL's someday. Maybe tags go hand-in-hand with co-processors to implement features like that?

          The metacolumn idea was actually the result of some brainstorming on how to do per-cell TTLs. By no means do I raise it as a potentially superior alternative, simply one implementation strategy that wouldn't touch core. However tags in KV would be another way to go that touches core only once.

          We could agree on criteria such as:

          • Tag storage optimized out if no tags present
          • Compartmentalized changes
          • Generic mechanism for adding, reading, removing, and modifying tags, usable by coprocessors.

          maybe?

          Show
          Andrew Purtell added a comment - All-in-all, the tags raise some interesting questions in my mind like "why isn't everything in the KV implemented as a tag?" For example, i'd love to see per-cell TTL's someday. Maybe tags go hand-in-hand with co-processors to implement features like that? The metacolumn idea was actually the result of some brainstorming on how to do per-cell TTLs. By no means do I raise it as a potentially superior alternative, simply one implementation strategy that wouldn't touch core. However tags in KV would be another way to go that touches core only once. We could agree on criteria such as: Tag storage optimized out if no tags present Compartmentalized changes Generic mechanism for adding, reading, removing, and modifying tags, usable by coprocessors. maybe?
          Hide
          Andrew Purtell added a comment -

          @Jon, Stack: No we don't have to mimic the Accumulo API though if the goal here is to be an alternative, it must be possible to build a direct API translation shim that provides the same labelling and visibility semantisc.

          Show
          Andrew Purtell added a comment - @Jon, Stack: No we don't have to mimic the Accumulo API though if the goal here is to be an alternative, it must be possible to build a direct API translation shim that provides the same labelling and visibility semantisc.
          Hide
          stack added a comment -

          I would assume the latter, a map of key values for each tag.

          This strikes me as odd; keyvalues within the keyvalues (and then keyvalues on the keyvalues in keyvalue) and then along comes Matt w/ his "why isn't everything in the KV implemented as a tag?" which is a reasonable question. A response would be because we don't want to have to have exotic deserialization and sort but I suppose with Trie we might have that already? It'd certainly be 'flexible' which is what you want LarsG? (A nice objective that usually comes at performance cost). A core of required's with optional tags that don't cost unless you use them would be grand.

          I think the data is so important, it should be part of the KV itself.

          Good point. Maybe not even lost, mayhaps a bug would cause us skip the metacolumn?

          ....attaching the tag bag after the key.

          This seems like it would not require much of a refactor (would have to somehow mark where key ends and tag/bag starts).

          Show
          stack added a comment - I would assume the latter, a map of key values for each tag. This strikes me as odd; keyvalues within the keyvalues (and then keyvalues on the keyvalues in keyvalue) and then along comes Matt w/ his "why isn't everything in the KV implemented as a tag?" which is a reasonable question. A response would be because we don't want to have to have exotic deserialization and sort but I suppose with Trie we might have that already? It'd certainly be 'flexible' which is what you want LarsG? (A nice objective that usually comes at performance cost). A core of required's with optional tags that don't cost unless you use them would be grand. I think the data is so important, it should be part of the KV itself. Good point. Maybe not even lost, mayhaps a bug would cause us skip the metacolumn? ....attaching the tag bag after the key. This seems like it would not require much of a refactor (would have to somehow mark where key ends and tag/bag starts).
          Hide
          Matt Corgan added a comment -

          Worth doing a bake-off?

          If we didn't want to wait for the KV interface development to percolate out to the client code (could take a while), then we'd have to add the tag info to the current KeyValue via Lars's suggestion or something like it so it can work with existing mechanisms like the memstore, client, filters, etc. The upside is that we wouldn't have to be super-paranoid about adding bloat to the KeyValue since it will someday become just one of the cell implementations. I guess I'm saying it's maybe ok to muck up the current KV even more given that data block encoding should be able to clean up the mess down the road. That being said, I don't personally need this feature so I hate to suggest mucking up anything!

          Show
          Matt Corgan added a comment - Worth doing a bake-off? If we didn't want to wait for the KV interface development to percolate out to the client code (could take a while), then we'd have to add the tag info to the current KeyValue via Lars's suggestion or something like it so it can work with existing mechanisms like the memstore, client, filters, etc. The upside is that we wouldn't have to be super-paranoid about adding bloat to the KeyValue since it will someday become just one of the cell implementations. I guess I'm saying it's maybe ok to muck up the current KV even more given that data block encoding should be able to clean up the mess down the road. That being said, I don't personally need this feature so I hate to suggest mucking up anything!
          Hide
          Laxman added a comment -

          Good to see lot of interest and discussion in this.

          I couldn't understand the above discussion completely.
          But, I've some basic questions.
          1) Do we really have any known use-cases for KV based access control?
          2) Does scalability affected badly for the users who needs at KV level access control?

          How about other approach of supporting access control through HBase views?
          If we want to restrict access for a CF/CQ and a set of rows to a user/group then create a view for that set and control access through view.

          It's just an idea based on my understanding of RDBMS (Oracle). Not really sure if this is really feasible in HBase. Please respond with use-cases and comments.

          Show
          Laxman added a comment - Good to see lot of interest and discussion in this. I couldn't understand the above discussion completely. But, I've some basic questions. 1) Do we really have any known use-cases for KV based access control? 2) Does scalability affected badly for the users who needs at KV level access control? How about other approach of supporting access control through HBase views ? If we want to restrict access for a CF/CQ and a set of rows to a user/group then create a view for that set and control access through view. It's just an idea based on my understanding of RDBMS (Oracle). Not really sure if this is really feasible in HBase. Please respond with use-cases and comments.
          Hide
          Lars George added a comment -

          @Andy That's not essential to storing labels in a metacolumn, though it may be advisable for performance reasons.

          Understood. I am not saying this is needed or that metacolumns do not work without. In fact, I think that they are very useful in the context you discussed with Matt, i.e. for example TTLs. I personally think there is a need for two optional features: a) metacolumns - which cover broader rules for a many columns or rows, and b) KV tags - which are carried as low as they can get to retain per cell information.

          So for TTL I would think that the tags are too low, yet for security I do think that metacolumns are too "weak" of a guarantee.

          @Andy and @Matt: So we may have this as a way to store tags inline with data, with dedup/optimize away if not needed; and we may have Lars' somehow tag structure addition to KV (Lars: what would that look like?). Worth doing a bake-off?

          I think this is not either or, but - and Matt please correct me if mistaken - if we add Trie compression then we can leverage the implementation to handle it. If we decide not to merge the two, then we can use my suggestion of adding them to the KV optionally and we can handle the compression implications later.

          @Andy: We could agree on criteria such as: Tag storage optimized out if no tags present

          Indeed, since we use a new type, no extra storage is needed if no tag is attached.

          @Andy: Compartmentalized changes

          Agreed, we add a new type and handle that case separately. Though the majority of the code is shared, the new type would trigger the extraction of the tags if called for (which I assume would be done lazily).

          @Andy: Generic mechanism for adding, reading, removing, and modifying tags, usable by coprocessors.

          These are the KeyValue.addTag(byte[] name, byte[] value) and KeyValue.getTag(byte[] name) helpers I was referring to. The coprocessors has full access that way, since the tags are carried for each KV.

          @Andy: No we don't have to mimic the Accumulo API though if the goal here is to be an alternative, it must be possible to build a direct API translation shim that provides the same labelling and visibility semantisc.

          Indeed. One of the arguments I hear comparing HBase and Accumulo is the fact that we have no cell level security tagging. That is what this is all about. My proposal is - as much as I can tell - lean (as it uses no extra storage if not used), can be combined with the non-cell level security (you might not want this level of security to avoid extra baggage), does not change the comparators, and overall is quite non-intrusive in existing code. On the other hand it seems useful for other cell level features in the future.

          As Jon says, Accumulo uses these tags and the always-on filter to achieve security (on a very high level view), and so can we then. For me this is comparable then. We do not need to comply to the entire API, but feature set level only.

          @stack: A core of required's with optional tags that don't cost unless you use them would be grand.

          That is exactly my point. As for "KV in KV", I do not see how this is "odd" as our KeyValue for starters is the odd one given what most people understand of what a KV is. Coming to terms with our complex key and various sorting rules is not trivial.

          @stack: Good point. Maybe not even lost, mayhaps a bug would cause us skip the metacolumn?

          Spot on!

          @Matt: I guess I'm saying it's maybe ok to muck up the current KV even more given that data block encoding should be able to clean up the mess down the road. That being said, I don't personally need this feature so I hate to suggest mucking up anything!

          Agreed, this is about timing as well. Your patch is highly intrusive - but for good reasons. So I would love to discuss this current issue with your changes already applied. But on the other hand we have to make a call for what we want and when?

          @Laxman: The basic premise here is to be on-par security wise with Accumulo. That is the use-case. As for scalability, I do not see why a few extra bytes and a coprocessor that checks them is disastrous. Sure, this needs evaluation, but we know that other systems - like Accumulo - does it, so if someone wants to enable it, they should see the same impact. Small or big. Or asking the other way around, where do you see this could affect the performance?

          How about other approach of supporting access control through HBase views?

          The issue is that these are typically only on the row level. With the cell level you can filter as fine grained as possible. Views - and please object if I am wrong - are more coarse grained. Think of blocking access to some columns differently across many rows. Not just all CF/CQs allowed for all rows.

          That latter is the crucial difference of what is needed to be on-par.

          Show
          Lars George added a comment - @Andy That's not essential to storing labels in a metacolumn, though it may be advisable for performance reasons. Understood. I am not saying this is needed or that metacolumns do not work without. In fact, I think that they are very useful in the context you discussed with Matt, i.e. for example TTLs. I personally think there is a need for two optional features: a) metacolumns - which cover broader rules for a many columns or rows, and b) KV tags - which are carried as low as they can get to retain per cell information. So for TTL I would think that the tags are too low, yet for security I do think that metacolumns are too "weak" of a guarantee. @Andy and @Matt: So we may have this as a way to store tags inline with data, with dedup/optimize away if not needed; and we may have Lars' somehow tag structure addition to KV (Lars: what would that look like?). Worth doing a bake-off? I think this is not either or, but - and Matt please correct me if mistaken - if we add Trie compression then we can leverage the implementation to handle it. If we decide not to merge the two, then we can use my suggestion of adding them to the KV optionally and we can handle the compression implications later. @Andy: We could agree on criteria such as: Tag storage optimized out if no tags present Indeed, since we use a new type, no extra storage is needed if no tag is attached. @Andy: Compartmentalized changes Agreed, we add a new type and handle that case separately. Though the majority of the code is shared, the new type would trigger the extraction of the tags if called for (which I assume would be done lazily). @Andy: Generic mechanism for adding, reading, removing, and modifying tags, usable by coprocessors. These are the KeyValue.addTag(byte[] name, byte[] value) and KeyValue.getTag(byte[] name) helpers I was referring to. The coprocessors has full access that way, since the tags are carried for each KV. @Andy: No we don't have to mimic the Accumulo API though if the goal here is to be an alternative, it must be possible to build a direct API translation shim that provides the same labelling and visibility semantisc. Indeed. One of the arguments I hear comparing HBase and Accumulo is the fact that we have no cell level security tagging. That is what this is all about. My proposal is - as much as I can tell - lean (as it uses no extra storage if not used), can be combined with the non-cell level security (you might not want this level of security to avoid extra baggage), does not change the comparators, and overall is quite non-intrusive in existing code. On the other hand it seems useful for other cell level features in the future. As Jon says, Accumulo uses these tags and the always-on filter to achieve security (on a very high level view), and so can we then. For me this is comparable then. We do not need to comply to the entire API, but feature set level only. @stack: A core of required's with optional tags that don't cost unless you use them would be grand. That is exactly my point. As for "KV in KV", I do not see how this is "odd" as our KeyValue for starters is the odd one given what most people understand of what a KV is. Coming to terms with our complex key and various sorting rules is not trivial. @stack: Good point. Maybe not even lost, mayhaps a bug would cause us skip the metacolumn? Spot on! @Matt: I guess I'm saying it's maybe ok to muck up the current KV even more given that data block encoding should be able to clean up the mess down the road. That being said, I don't personally need this feature so I hate to suggest mucking up anything! Agreed, this is about timing as well. Your patch is highly intrusive - but for good reasons. So I would love to discuss this current issue with your changes already applied. But on the other hand we have to make a call for what we want and when? @Laxman: The basic premise here is to be on-par security wise with Accumulo. That is the use-case. As for scalability, I do not see why a few extra bytes and a coprocessor that checks them is disastrous. Sure, this needs evaluation, but we know that other systems - like Accumulo - does it, so if someone wants to enable it, they should see the same impact. Small or big. Or asking the other way around, where do you see this could affect the performance? How about other approach of supporting access control through HBase views? The issue is that these are typically only on the row level. With the cell level you can filter as fine grained as possible. Views - and please object if I am wrong - are more coarse grained. Think of blocking access to some columns differently across many rows. Not just all CF/CQs allowed for all rows. That latter is the crucial difference of what is needed to be on-par.
          Hide
          Laxman added a comment -

          The basic premise here is to be on-par security wise with Accumulo. That is the use-case.

          IMHO, that's one implementation but not use-case. Definitely, Accumulo would have some straight use-case. Do we that use-case? Based on use-case, we can brainstorm on different approaches (KV level, Views, something else may be).

          where do you see this could affect the performance?

          I have following concern w.r.to scalability.

          • With current implementation, ACLs are cached. With cell level, it may grow heavily.
          • Please take a look @AccessController.permissionGranted method. We need to call this method(+some more checks for KV based) for every KV. This may become a hotspot when we introduce KV based access control.

          We are currently evaluating performance with security enabled. Soon, I will share our report.

          Think of blocking access to some columns differently across many rows.

          I agree. Can you please explain how do we solve this with a traditional RDBMS like Oracle.

          Note: I definitely don't want to bring up the well known discussion "SQL vs NOSQL" here and I'm only trying to understand the use-case as a HBase user/developer. Only point I want to put forward is we should have proper understanding of use-case and user before we start on a approach/solution.

          Show
          Laxman added a comment - The basic premise here is to be on-par security wise with Accumulo. That is the use-case. IMHO, that's one implementation but not use-case. Definitely, Accumulo would have some straight use-case. Do we that use-case? Based on use-case, we can brainstorm on different approaches (KV level, Views, something else may be). where do you see this could affect the performance? I have following concern w.r.to scalability . With current implementation, ACLs are cached. With cell level, it may grow heavily. Please take a look @AccessController.permissionGranted method. We need to call this method(+some more checks for KV based) for every KV. This may become a hotspot when we introduce KV based access control. We are currently evaluating performance with security enabled. Soon, I will share our report. Think of blocking access to some columns differently across many rows. I agree. Can you please explain how do we solve this with a traditional RDBMS like Oracle. Note: I definitely don't want to bring up the well known discussion "SQL vs NOSQL" here and I'm only trying to understand the use-case as a HBase user/developer. Only point I want to put forward is we should have proper understanding of use-case and user before we start on a approach/solution.
          Hide
          Anoop Sam John added a comment -

          I dont think the approach that we are discussing is, to keep the ACL details of each cell in the acl table. So that wont be a concern as such

          Show
          Anoop Sam John added a comment - I dont think the approach that we are discussing is, to keep the ACL details of each cell in the acl table. So that wont be a concern as such
          Hide
          Laxman added a comment -

          Thanks for clarification Anoop. So, one of the concerns w.r.to cache can be ruled out for further discussion.

          Show
          Laxman added a comment - Thanks for clarification Anoop. So, one of the concerns w.r.to cache can be ruled out for further discussion.
          Hide
          Keith Turner added a comment -

          The security label in the Accumulo key is called Column Visibility. Its part of the sort order and uniqueness of an
          Accumulo key. The key is sorted on row, then column fam, column qual, then column vis, then timestamp, then delete flag. So for example the following entries would be distinct and sort in the following way.

          row=1234 cf=name cq=last cv=S&GOLF ts=5 val=Smith
          row=1234 cf=name cq=last cv=S&GOLF ts=4 val=Smit
          row=1234 cf=name cq=last cv=U ts=9 val=Doe

          Show
          Keith Turner added a comment - The security label in the Accumulo key is called Column Visibility. Its part of the sort order and uniqueness of an Accumulo key. The key is sorted on row, then column fam, column qual, then column vis, then timestamp, then delete flag. So for example the following entries would be distinct and sort in the following way. row=1234 cf=name cq=last cv=S&GOLF ts=5 val=Smith row=1234 cf=name cq=last cv=S&GOLF ts=4 val=Smit row=1234 cf=name cq=last cv=U ts=9 val=Doe
          Hide
          Lars George added a comment -

          A comparable concept in the RDBMS world is Oracle's label security:

          http://www.oracle.com/us/products/database/options/label-security/overview/index.html

          This is much more involved as it has labels, departments, and so on. It also combines default labels with runtime ones, making up the actual value to check against. The data labels are stored in a hidden column in the table and is for the entire row.

          Accumulo stores the labels with the policy in each KV, for example "A&B&(D|E)". The system level filter evaluates the labels and compares them to the actual authorization details of the user. It then let's the user access the data or not. So this is simpler compared to OLS.

          I was thinking that adding "tags" is the actual support feature to enable the same functionality. Then we need a coprocessor to apply the rules. The part we do not have here is the authorization against the labels. The labels in Accumulo are created ad-hoc, which means we would need to attach the user authorization in the ACL table, but that can be cached.

          Show
          Lars George added a comment - A comparable concept in the RDBMS world is Oracle's label security: http://www.oracle.com/us/products/database/options/label-security/overview/index.html This is much more involved as it has labels, departments, and so on. It also combines default labels with runtime ones, making up the actual value to check against. The data labels are stored in a hidden column in the table and is for the entire row. Accumulo stores the labels with the policy in each KV, for example "A&B&(D|E)". The system level filter evaluates the labels and compares them to the actual authorization details of the user. It then let's the user access the data or not. So this is simpler compared to OLS. I was thinking that adding "tags" is the actual support feature to enable the same functionality. Then we need a coprocessor to apply the rules. The part we do not have here is the authorization against the labels. The labels in Accumulo are created ad-hoc, which means we would need to attach the user authorization in the ACL table, but that can be cached.
          Hide
          Andrew Purtell added a comment - - edited

          Correct me if I'm wrong but given all of the above discussion therefore the scope of changes necessary are:

          • Add a facility for storing string tags with each KV. Will need to do this for both puts and deletes.
          • Update the KV comparators to take into account tags in the sort order, if present.
          • Add a coprocessor (perhaps another implementation of Constraint) that:
            • Implements a visibility filter that evaluates for each KV if the current set of user-supplied labels intersects with allowed labels according to visibility predicates stored in the KV, if any, and emits (or not);
            • Unpacks labels added by the client via our generic op attributes facility and adds them to KVs in a mutator op or programs the visibility filter on a Get or Scan;
            • Always adds the visibility filter to Get and Scan operations, wrapping any other supplied filter with FilterList(MUST_PASS_ALL) and the visibility filter on the RHS so we can let the other filters reduce the number of visibility evaluations necessary.
          • Shell support
            • Add setuaths and getauths equivalent shell commands
            • Extend the other shell ops with sugar for adding labels as a working example of how to do it via the Java client

          And the constraints on implementation are:

          • Tag storage optimized out if no tags present
          • Compartmentalized changes
          • The mechanism for adding, reading, removing, and modifying tags in the KV should be generic

          Does that about sum it up?

          @Laxman: Such a Constraint-based coprocessor as described would be orthogonal to AccessController. One could use both at the same time independently, or they could be unified at some point.

          @Lars

          @Andy: Compartmentalized changes

          Agreed, we add a new type and handle that case separately. Though the majority of the code is shared, the new type would trigger the extraction of the tags if called for (which I assume would be done lazily).

          So one new type that causes us to extract tags then recurse on the actual wrapped op? And we extend all necessary internal method signatures to carry tags around? Edit: By that I mean a pointer to where they are serialized in byte[] if doing it lazily as you suggest.

          Edit: Removed one line of language where I went off in the weeds a bit.

          Show
          Andrew Purtell added a comment - - edited Correct me if I'm wrong but given all of the above discussion therefore the scope of changes necessary are: Add a facility for storing string tags with each KV. Will need to do this for both puts and deletes. Update the KV comparators to take into account tags in the sort order, if present. Add a coprocessor (perhaps another implementation of Constraint) that: Implements a visibility filter that evaluates for each KV if the current set of user-supplied labels intersects with allowed labels according to visibility predicates stored in the KV, if any, and emits (or not); Unpacks labels added by the client via our generic op attributes facility and adds them to KVs in a mutator op or programs the visibility filter on a Get or Scan; Always adds the visibility filter to Get and Scan operations, wrapping any other supplied filter with FilterList(MUST_PASS_ALL) and the visibility filter on the RHS so we can let the other filters reduce the number of visibility evaluations necessary. Shell support Add setuaths and getauths equivalent shell commands Extend the other shell ops with sugar for adding labels as a working example of how to do it via the Java client And the constraints on implementation are: Tag storage optimized out if no tags present Compartmentalized changes The mechanism for adding, reading, removing, and modifying tags in the KV should be generic Does that about sum it up? @Laxman: Such a Constraint-based coprocessor as described would be orthogonal to AccessController. One could use both at the same time independently, or they could be unified at some point. @Lars @Andy: Compartmentalized changes Agreed, we add a new type and handle that case separately. Though the majority of the code is shared, the new type would trigger the extraction of the tags if called for (which I assume would be done lazily). So one new type that causes us to extract tags then recurse on the actual wrapped op? And we extend all necessary internal method signatures to carry tags around? Edit: By that I mean a pointer to where they are serialized in byte[] if doing it lazily as you suggest. Edit: Removed one line of language where I went off in the weeds a bit.
          Hide
          Jonathan Hsieh added a comment -

          From my point of view, I'd like really like to understand more than just accumulo's implementation – I really care about if accumulo's semantics are 1) intentional and required for accumulo use cases and 2) if applications only use a constrained sets of its capabilities. One specific thing I don't quite understand is the ramifications of having column visibility settings are encoded as part of the key and sort order. This could be equivalent expressions that are no longer equals, and some of somewhat goofy future/past views.

          Another thought: At a high level it seems odd for the co-processor to constrain what can be seen – we definitely would not want to let a "normal" client view the raw underlaying tags or visibility metadata tables!

          IMO I'd probably prefer a completely generic tag system for HBase only after we have a few different serious use cases (possibly existing ones!) that would use it. Making something overly generic introduces its own set of new problems.

          @Keith I'll search on the accumulo mailing to see if things will answer my semantics questions – and if I don't find it I'll shoot off some questions there.

          Show
          Jonathan Hsieh added a comment - From my point of view, I'd like really like to understand more than just accumulo's implementation – I really care about if accumulo's semantics are 1) intentional and required for accumulo use cases and 2) if applications only use a constrained sets of its capabilities. One specific thing I don't quite understand is the ramifications of having column visibility settings are encoded as part of the key and sort order. This could be equivalent expressions that are no longer equals, and some of somewhat goofy future/past views. Another thought: At a high level it seems odd for the co-processor to constrain what can be seen – we definitely would not want to let a "normal" client view the raw underlaying tags or visibility metadata tables! IMO I'd probably prefer a completely generic tag system for HBase only after we have a few different serious use cases (possibly existing ones!) that would use it. Making something overly generic introduces its own set of new problems. @Keith I'll search on the accumulo mailing to see if things will answer my semantics questions – and if I don't find it I'll shoot off some questions there.
          Hide
          Andrew Purtell added a comment -

          @Jon You just argued for a metacolumn, no?

          Show
          Andrew Purtell added a comment - @Jon You just argued for a metacolumn, no?
          Hide
          Andrew Purtell added a comment -

          At a high level it seems odd for the co-processor to constrain what can be seen – we definitely would not want to let a "normal" client view the raw underlaying tags or visibility metadata tables!

          There would not be a "normal" client in this case, the coprocessor would obviously be installed as a system level coprocessor and the visibility filter would therefore always be active. Didn't mean to imply otherwise somehow. Certainly that's not how I read the comment stream above.

          Show
          Andrew Purtell added a comment - At a high level it seems odd for the co-processor to constrain what can be seen – we definitely would not want to let a "normal" client view the raw underlaying tags or visibility metadata tables! There would not be a "normal" client in this case, the coprocessor would obviously be installed as a system level coprocessor and the visibility filter would therefore always be active. Didn't mean to imply otherwise somehow. Certainly that's not how I read the comment stream above.
          Hide
          Eric Newton added a comment -

          For clarification, Accumulo does not support negation in the visibility expressions. "!" may be reserved for future use, but expressions are limited to terms, "&" or "|" and parens for precedence.

          Visibility expressions are not required: the metadata table does not use them, for example.

          The visibility needs to be part of the key in order to have different values at different visibilities.

          @Jonathan - enjoyed your talk @hadoop summit.

          Show
          Eric Newton added a comment - For clarification, Accumulo does not support negation in the visibility expressions. "!" may be reserved for future use, but expressions are limited to terms, "&" or "|" and parens for precedence. Visibility expressions are not required: the metadata table does not use them, for example. The visibility needs to be part of the key in order to have different values at different visibilities. @Jonathan - enjoyed your talk @hadoop summit.
          Hide
          Lars George added a comment -

          As a note, I raised the question of what we need as per the use-cases we see at customers. We are in the process of evaluating this and I should be able to share the requirements here sometime soon.

          One of the questions is where we put policies: either in the labels and therefore the tags on each KV, or in a separate place that applies dynamic runtime policies that match the user to the data labels.

          Show
          Lars George added a comment - As a note, I raised the question of what we need as per the use-cases we see at customers. We are in the process of evaluating this and I should be able to share the requirements here sometime soon. One of the questions is where we put policies: either in the labels and therefore the tags on each KV, or in a separate place that applies dynamic runtime policies that match the user to the data labels.
          Hide
          Jerry Chen added a comment -

          @Jon

          From my point of view, I'd like really like to understand more than just accumulo's implementation – I really care about if accumulo's semantics are 1) intentional and required for accumulo use cases and 2) if applications only use a constrained sets of its capabilities. One specific thing I don't quite understand is the ramifications of having column visibility settings are encoded as part of the key and sort order. This could be equivalent expressions that are no longer equals, and some of somewhat goofy future/past views.

          I looked into the accumulo implementation. And one thing that accumulo want to achieve by making the ColumnVisibility as a part of the column key is that the authorization can be enforced without reading the existing records that may be affected by the current mutation. Because the ColumnVisibility is part of the key, you need to explicitly give the ColumnVisibility to identify/match to the column you are targeting to change (put or delete), and thus VisibilityConstraint check can be performed on the given ColumnVisibility with the user's authorization tokens to make sure the user has been authorized for performing the mutation logically on an existing column, without scanning the existing columns that it may make changes on. HBase and accumulo are at the same situation for this.

          There may some other issues to be addressed if something is not part of the key and also not part of the value when multiple versions of a logical column existed. For example, a Put with new Visibility values of the key of

          {row1, family1:qualifier1}

          will make an logical changes on all the cells with key

          {row1, family1:qualifier1}

          , and thus authorization must be checked over all these affected items (which may with different Visibility values) with the user authorization to see whether the Put can be performed or not. And the deletion gets the same thing to consider when considering DeleteFamily and DeleteColumn which logically affect a lot of columns that may have different Visibility values).

          One important change on the client API when the Visibility is part of the column key is the Visibility need to be specified either explicitly or implicitly (such as empty Visibility is used when no Visibility provided in the parameters) when performing Put or Delete mutations. This does seem a little strange at a first glance when comparing with approaches used by traditional database row level authorization such as Oracle Label Security. But the question is do we have other better choices both solve the problem and fit into the current framework?

          Show
          Jerry Chen added a comment - @Jon From my point of view, I'd like really like to understand more than just accumulo's implementation – I really care about if accumulo's semantics are 1) intentional and required for accumulo use cases and 2) if applications only use a constrained sets of its capabilities. One specific thing I don't quite understand is the ramifications of having column visibility settings are encoded as part of the key and sort order. This could be equivalent expressions that are no longer equals, and some of somewhat goofy future/past views. I looked into the accumulo implementation. And one thing that accumulo want to achieve by making the ColumnVisibility as a part of the column key is that the authorization can be enforced without reading the existing records that may be affected by the current mutation. Because the ColumnVisibility is part of the key, you need to explicitly give the ColumnVisibility to identify/match to the column you are targeting to change (put or delete), and thus VisibilityConstraint check can be performed on the given ColumnVisibility with the user's authorization tokens to make sure the user has been authorized for performing the mutation logically on an existing column, without scanning the existing columns that it may make changes on. HBase and accumulo are at the same situation for this. There may some other issues to be addressed if something is not part of the key and also not part of the value when multiple versions of a logical column existed. For example, a Put with new Visibility values of the key of {row1, family1:qualifier1} will make an logical changes on all the cells with key {row1, family1:qualifier1} , and thus authorization must be checked over all these affected items (which may with different Visibility values) with the user authorization to see whether the Put can be performed or not. And the deletion gets the same thing to consider when considering DeleteFamily and DeleteColumn which logically affect a lot of columns that may have different Visibility values). One important change on the client API when the Visibility is part of the column key is the Visibility need to be specified either explicitly or implicitly (such as empty Visibility is used when no Visibility provided in the parameters) when performing Put or Delete mutations. This does seem a little strange at a first glance when comparing with approaches used by traditional database row level authorization such as Oracle Label Security. But the question is do we have other better choices both solve the problem and fit into the current framework?
          Hide
          stack added a comment -

          Jerry Chen A put that broadened visibility would be for the current put only? How would it effect already-put values? (I can see how an Increment would have to check current visibility setting to see if it was allowed increment it – ditto for check and put) The DeleteFamily, DeleteColumn case is interesting. HBase would have to change how it does these operation pretty radically – at least the way we do delete family. Lars George Anything to report? It has to be per KV? Or could it be metacolumn/Oracle Label Security-like?

          Show
          stack added a comment - Jerry Chen A put that broadened visibility would be for the current put only? How would it effect already-put values? (I can see how an Increment would have to check current visibility setting to see if it was allowed increment it – ditto for check and put) The DeleteFamily, DeleteColumn case is interesting. HBase would have to change how it does these operation pretty radically – at least the way we do delete family. Lars George Anything to report? It has to be per KV? Or could it be metacolumn/Oracle Label Security-like?
          Hide
          Jerry Chen added a comment -

          Stack

          A put that broadened visibility would be for the current put only? How would it effect already-put values?

          When the visibility is part of the column key, a broadened visibility will not affect the existing columns that with the same

          {rowid, family, qualifier}

          but with different visibilities. Thus, the put will only affect the columns that has the same

          {rowid, family, qualifier, visibility}. Different visibilities has the same effect as different qualifiers.

          While as to DeleteFamily or DeleteColumn, Accumulo doesn't have such as operations. It has only Delete mutation to delete a specified {rowid, family, qualifier, visibility}

          . The idea to keep DeleteFamily and DeleteColumn still working with visibility in HBase is that A DeleteFamily operation now will only affect "all columns in this family with the specified visibility" other than originally "all columns in this family". The same with DeleteColumn.

          One thing to consider if the visibility is part of the key. As there are suggestions to provide support for general tags for KV so that not only visibility tags can be stored in it, but also other tags that needed in the future can be added easily. Will general tags concept (comparing to a visibility tag) makes the concept of the column key too complex?

          Show
          Jerry Chen added a comment - Stack A put that broadened visibility would be for the current put only? How would it effect already-put values? When the visibility is part of the column key, a broadened visibility will not affect the existing columns that with the same {rowid, family, qualifier} but with different visibilities. Thus, the put will only affect the columns that has the same {rowid, family, qualifier, visibility}. Different visibilities has the same effect as different qualifiers. While as to DeleteFamily or DeleteColumn, Accumulo doesn't have such as operations. It has only Delete mutation to delete a specified {rowid, family, qualifier, visibility} . The idea to keep DeleteFamily and DeleteColumn still working with visibility in HBase is that A DeleteFamily operation now will only affect "all columns in this family with the specified visibility" other than originally "all columns in this family". The same with DeleteColumn. One thing to consider if the visibility is part of the key. As there are suggestions to provide support for general tags for KV so that not only visibility tags can be stored in it, but also other tags that needed in the future can be added easily. Will general tags concept (comparing to a visibility tag) makes the concept of the column key too complex?
          Hide
          Andrew Purtell added a comment -

          I am currently exploring/coding a couple of implementation options in parallel. Please ping me if you're working in this area.

          Show
          Andrew Purtell added a comment - I am currently exploring/coding a couple of implementation options in parallel. Please ping me if you're working in this area.
          Hide
          Himanshu Vashishtha added a comment -

          Andrew Purtell We would like to collaborate on adding this feature. What are the approaches you are considering?

          We also looked into this feature a while ago, and came up with the attached PRD and design. It will be great to have your feedback on it.

          Show
          Himanshu Vashishtha added a comment - Andrew Purtell We would like to collaborate on adding this feature. What are the approaches you are considering? We also looked into this feature a while ago, and came up with the attached PRD and design. It will be great to have your feedback on it.
          Hide
          Andrew Purtell added a comment -

          Thanks for attaching the design documents. I could comment on them but am pursing a substantially different direction.

          Regarding KV tags:

          Right now I'm experimenting with code that adds tags to KVs in a backwards compatible way. I would call this a transitional change. We overload the existing KV binary format, technically the tags are prepended to value data but are not considered part of it. For KVs that contain tags, the value length is encoded as a negative integer. When a negative value length is encountered, we walk over the value part of the KV, parsing out tags, until we reach an end marker. Once we know the size of the tag data, we adjust the value offset and value length returned to API callers. The parsing of tags in the KV is cached after the first time it is performed. Thus no change to KV handling has to happen outside of the KeyValue class.

          Regarding cell level permissions:

          Instead of visibility labels we extend the AccessController to support per cell ACLs. These are serialized TablePermissions stored with a KV in a tag. We add a helper class UserTablePermissions which supports both PB and Writable serialization of Listmultimap<String,TablePermission>. The permissions checking of the AccessController is extended in a straightforward way to check if any KV passing through the AccessControlFilter contains tags containing ACLs. If so, an access grant check is first performed using TablePermissions deserialized from the tag(s). This allows per cell ACLs to overload CF or table level permissions wherever they might exist. While the TablePermission type is not strictly needed for this (Permission would suffice), it was easier to simply allow a TablePermission to store nulls for table, cf, and qualifier than make more extensive changes to the AccessController. Doing so could be considered a refinement. This does mean that the AccessControlFilter implementation is changed to pass back SKIP instead of NEXT_COL hints, so every KV is examined by the filter, and use of the filter to wrap Gets and Scans is now unconditional. The performance impact of this has yet to be quantified but is expected not to be that different from changes that require per-KV visibility labels to be evaluated (and could be less expensive since we are not evaluating a mini label language).

          No API changes are necessarily required. The new UserTablePermissions type is serialized to PB and then passed as an attribute using the existing API facility for that. Not all mutation types are supported. It doesn't make sense to attach an ACL on a Delete, because the KVs covered by the delete will be ... deleted. Additionally, Increment is not currently supported because it doesn't support attributes (doesn't extend either Mutation nor OperationWithAttributes) and anyway the upsert code paths don't pass KVs through a coprocessor down into HBase core so bundling tags with the user value doesn't work there. Also, while I haven't done this quite yet, I anticipate the upsert code will require a second core modification to preserve any tags that might be in the value to update.

          Obviously my goal is to add per-cell ("per-KeyValue") security through as simple and straightforward extensions of the existing AccessController implementation as possible, with minimal to no changes to core code*. I do not aim for Accumulo parity nor support for multi label security schemes. It may be more appropriate to post this work to another JIRA when it is ready for review. This work could be considered orthogonal to any label based approach, and could probably play nicely together with it should one come to exist, as long as both are based on a generic KV tag facility.

          • - I am also evaluating an option that avoids making any change to KeyValue, thus no change to core HBase at all. Instead it stores the equivalent of KV tags in "shadow KVs" stored in a shadow metacolumn. Implementation of this is underway but as you can imagine it's not so straightforward to produce something that could perform (nearly) as well as inline in-KV checks. This approach wouldn't have the limitation regarding Increment nor the requirement to change the upsert code.

          All of the above is subject to change.

          Show
          Andrew Purtell added a comment - Thanks for attaching the design documents. I could comment on them but am pursing a substantially different direction. Regarding KV tags: Right now I'm experimenting with code that adds tags to KVs in a backwards compatible way. I would call this a transitional change. We overload the existing KV binary format, technically the tags are prepended to value data but are not considered part of it. For KVs that contain tags, the value length is encoded as a negative integer. When a negative value length is encountered, we walk over the value part of the KV, parsing out tags, until we reach an end marker. Once we know the size of the tag data, we adjust the value offset and value length returned to API callers. The parsing of tags in the KV is cached after the first time it is performed. Thus no change to KV handling has to happen outside of the KeyValue class. Regarding cell level permissions: Instead of visibility labels we extend the AccessController to support per cell ACLs. These are serialized TablePermissions stored with a KV in a tag. We add a helper class UserTablePermissions which supports both PB and Writable serialization of Listmultimap<String,TablePermission>. The permissions checking of the AccessController is extended in a straightforward way to check if any KV passing through the AccessControlFilter contains tags containing ACLs. If so, an access grant check is first performed using TablePermissions deserialized from the tag(s). This allows per cell ACLs to overload CF or table level permissions wherever they might exist. While the TablePermission type is not strictly needed for this (Permission would suffice), it was easier to simply allow a TablePermission to store nulls for table, cf, and qualifier than make more extensive changes to the AccessController. Doing so could be considered a refinement. This does mean that the AccessControlFilter implementation is changed to pass back SKIP instead of NEXT_COL hints, so every KV is examined by the filter, and use of the filter to wrap Gets and Scans is now unconditional. The performance impact of this has yet to be quantified but is expected not to be that different from changes that require per-KV visibility labels to be evaluated (and could be less expensive since we are not evaluating a mini label language). No API changes are necessarily required. The new UserTablePermissions type is serialized to PB and then passed as an attribute using the existing API facility for that. Not all mutation types are supported. It doesn't make sense to attach an ACL on a Delete, because the KVs covered by the delete will be ... deleted. Additionally, Increment is not currently supported because it doesn't support attributes (doesn't extend either Mutation nor OperationWithAttributes) and anyway the upsert code paths don't pass KVs through a coprocessor down into HBase core so bundling tags with the user value doesn't work there. Also, while I haven't done this quite yet, I anticipate the upsert code will require a second core modification to preserve any tags that might be in the value to update. Obviously my goal is to add per-cell ("per-KeyValue") security through as simple and straightforward extensions of the existing AccessController implementation as possible, with minimal to no changes to core code*. I do not aim for Accumulo parity nor support for multi label security schemes. It may be more appropriate to post this work to another JIRA when it is ready for review. This work could be considered orthogonal to any label based approach, and could probably play nicely together with it should one come to exist, as long as both are based on a generic KV tag facility. - I am also evaluating an option that avoids making any change to KeyValue, thus no change to core HBase at all. Instead it stores the equivalent of KV tags in "shadow KVs" stored in a shadow metacolumn. Implementation of this is underway but as you can imagine it's not so straightforward to produce something that could perform (nearly) as well as inline in-KV checks. This approach wouldn't have the limitation regarding Increment nor the requirement to change the upsert code. All of the above is subject to change.
          Hide
          Ted Yu added a comment -

          @Himanshu:
          I have some questions about your design doc.

          Versions will be kept for each unique visibility expression.

          Would this inflate memstore because we are keeping potentially many more versions of KeyValue which differ by visibility expression only ?

          HTable-level property: a property “ENABLE_CELL_LEVEL_SECURITY” in

          Since HFile metadata would include similar information, looks like storing such information at column family level is better.

          The property “ENABLE_CELL_LEVEL_SECURITY” can be change only after enabling/disabling the table.

          I assume that the table is in disabled state when this property is changed.

          on page 12:

          It attaches the CVFilter to the Get object,

          Suppose a user belongs to more than one group, would multiple CVFilter instances be attached to the Get object ?

          it passes only the ones which pass the “Secret” visibility expression.He will get v1 and v4.

          The above is inconsistent with description on page 11 where v2 and v4 are said to be returned.

          Thanks

          Show
          Ted Yu added a comment - @Himanshu: I have some questions about your design doc. Versions will be kept for each unique visibility expression. Would this inflate memstore because we are keeping potentially many more versions of KeyValue which differ by visibility expression only ? HTable-level property: a property “ENABLE_CELL_LEVEL_SECURITY” in Since HFile metadata would include similar information, looks like storing such information at column family level is better. The property “ENABLE_CELL_LEVEL_SECURITY” can be change only after enabling/disabling the table. I assume that the table is in disabled state when this property is changed. on page 12: It attaches the CVFilter to the Get object, Suppose a user belongs to more than one group, would multiple CVFilter instances be attached to the Get object ? it passes only the ones which pass the “Secret” visibility expression.He will get v1 and v4. The above is inconsistent with description on page 11 where v2 and v4 are said to be returned. Thanks
          Hide
          Himanshu Vashishtha added a comment -

          Andrew PurtellThanks for the writeup. Yes, your approach is different. Few thoughts:

          a) Its not clear how/where you are storing the acls at keyvalue level. You use acl table, or something else? Reading through this gives an idea that the keyvalue level acls have null table, cf and cq attribute with them. Does this mean a user can have only one type of KV level acls in the application. Or in other words, how does the KV acl looks like in the acl table?

          b) Currently, all the acl entries are stored in zk (limit of znode is 1 mb); will you be using the same approach?

          It doesn't make sense to attach an ACL on a Delete, because the KVs covered by the delete will be ... deleted.

          I don't completely agree with this but will not comment also unless I completely understand your approach.

          Ted Yu Thanks for taking time to read through it.

          Versions will be kept for each unique visibility expression.Would this inflate memstore because we are keeping potentially many more versions of KeyValue which differ by visibility expression only ?

          Or HFile, right? If so, yes, as different visibility expressions provides different access control to user. If user A does two Put with Visibility A|B, C and then does a delete with visibility A|B, another user with acl of C should be able to see this.

          re: ENABLE_CELL_LEVEL_SECURITY
          I initially visualized a table level (will give some more thought about CF level... as mentioned in the doc also)
          Yes, change it after disabling the table as it will flush out the memstore.

          re: Multiple CVFilter
          No, only one should be good enough. How come you got this idea? I should fix the doc.

          re: Typo:
          Yes, he will get v2 and v4. Thanks for pointing this out.

          Show
          Himanshu Vashishtha added a comment - Andrew Purtell Thanks for the writeup. Yes, your approach is different. Few thoughts: a) Its not clear how/where you are storing the acls at keyvalue level. You use acl table, or something else? Reading through this gives an idea that the keyvalue level acls have null table, cf and cq attribute with them. Does this mean a user can have only one type of KV level acls in the application. Or in other words, how does the KV acl looks like in the acl table? b) Currently, all the acl entries are stored in zk (limit of znode is 1 mb); will you be using the same approach? It doesn't make sense to attach an ACL on a Delete, because the KVs covered by the delete will be ... deleted. I don't completely agree with this but will not comment also unless I completely understand your approach. Ted Yu Thanks for taking time to read through it. Versions will be kept for each unique visibility expression.Would this inflate memstore because we are keeping potentially many more versions of KeyValue which differ by visibility expression only ? Or HFile, right? If so, yes, as different visibility expressions provides different access control to user. If user A does two Put with Visibility A|B, C and then does a delete with visibility A|B, another user with acl of C should be able to see this. re: ENABLE_CELL_LEVEL_SECURITY I initially visualized a table level (will give some more thought about CF level... as mentioned in the doc also) Yes, change it after disabling the table as it will flush out the memstore. re: Multiple CVFilter No, only one should be good enough. How come you got this idea? I should fix the doc. re: Typo: Yes, he will get v2 and v4. Thanks for pointing this out.
          Hide
          Andrew Purtell added a comment -

          Himanshu Vashishtha Thanks for the gracious comments.

          Currently, all the acl entries are stored in zk (limit of znode is 1 mb); will you be using the same approach?

          No. Pursuing an incremental approach, so, we keep per-table and per-CF ACLs implemented as-is for now. Then add per cell ACLs as a third level of permissions checking. Per cell ACLs are stored in the region alongside the data. It's an open question if having an ACL table will still be necessary as things evolve.

          Its not clear how/where you are storing the acls at keyvalue level.

          There are different approaches to doing this, will try two of them to establish the extent of changes to core code required and performance differences:

          1. Store per cell ACLs in tags in KV, using the approach to putting tags in KV that I described above.

          2. Store per cell ACLs in a "shadow ACL column". Per cell ACLs are still associated with individual KVs but in effect tag storage is moved external to the data KV, so no changes to the KeyValue implementation are necessary, and stored in the same region. Will incur some additional cost for querying tags but will not require intra cluster RPC.

          3. Store per cell ACLs in the ACL table. Tag storage again is external to the data KV, and now in a separate table. Will incur cost for querying tags and furthermore require intra cluster RPC. Probably won't actually try this.

          You use acl table, or something else?

          The ACL is stored "in" the KV (depends which impl option if that is physical or logical). The table, row, family, qualifier, and timestamp of an in-KV permission are implicit – those of the KV's location. I do use TablePermission to represent ACLs at the KV level but only to avoid some refactoring in the AccessController that would be outside the scope of a proof of concept. Pardon if mentioning this detail was confusing.

          It doesn't make sense to attach an ACL on a Delete, because the KVs covered by the delete will be ... deleted.

          I don't completely agree with this but will not comment also unless I completely understand your approach.

          We treat ACLs on a KV as timestamped like the rest of the KV. An ACL in a new Put applies only to that Put. It doesn't change the ACL of any previous Put. However, we require a Put to have covering[1] permission – so either table perms, CF perms, or perms in the most recent visible existing value, if any, must allow the pending Put (or Append, or Increment, or any mutation) in order for it to be applied. This is because the pending operation will replace the visible value and its ACL, if it has one, for Gets and Scans that don't ask for more than one version. If you actually want to change the ACL on a specific KV, you must Delete then Put, if you have perms for that. On reads, a KV is visible to the InternalScanner if either table perms, CF perms, or perms stored in the KV allow, similar to visibility labels in a sense but with ACL semantics. The AccessControlFilter is extended to do that[2]. For Deletes, all visible prior versions covered by the Delete must allow the Delete, because a major compaction could remove them if we allow the tombstone and it covers them, regardless of any ACL they might contain. "Visible" here is defined as not covered by an already committed tombstone. This allows simple and straightforward evolution of security policy over time without requiring expensive updates[3].

          So a Delete doesn't update ACLs. ACLs are bits of information to be stored with the KV. It's meaningless to put an ACL on an op that stores nothing. And, an ACL on a Delete would not grant that Delete permission to do something. It would be the union of per table, per CF, and all ACLs on visible covered values that would grant (or not). Since we are only looking at the most recent version of a KV, if any, except for Deletes, only Delete - and especially DeleteColumn - can be potentially expensive to check. There is no free lunch but union-of-ACL semantics helps: Granting appropriate per table or per CF perms to a user or group obviates the need for KV ACL checking for that user or her groups, so we can early out and avoid going to the store in that case.

          [1] Maybe "covering" isn't quite the best term. Suggestions welcome.

          [2] This changes AC semantics some. A user who doesn't have permissions to a CF or exact location (column+qualifier) just won't get values back from a scan or get, as opposed to previous behavior which would throw an AccessDeniedException if the user did not have required perms to the CF at the CF level. I refactored TestAccessController to address this in a clean way.

          [3] On the other hand GRANT and REVOKE commands could provide a convenient way to change per KV ACLs too, since they are DDL not DML ops. We could consider enhancing the GRANT and REVOKE command processing in the AccessController to replace all per KV ACLs covered by the action, if they exist. A possible optimization is lazy updates via a timestamped log of recent GRANTs and REVOKEs (but I just thought of that now so it's a superficial suggestion).

          Show
          Andrew Purtell added a comment - Himanshu Vashishtha Thanks for the gracious comments. Currently, all the acl entries are stored in zk (limit of znode is 1 mb); will you be using the same approach? No. Pursuing an incremental approach, so, we keep per-table and per-CF ACLs implemented as-is for now. Then add per cell ACLs as a third level of permissions checking. Per cell ACLs are stored in the region alongside the data. It's an open question if having an ACL table will still be necessary as things evolve. Its not clear how/where you are storing the acls at keyvalue level. There are different approaches to doing this, will try two of them to establish the extent of changes to core code required and performance differences: 1. Store per cell ACLs in tags in KV, using the approach to putting tags in KV that I described above. 2. Store per cell ACLs in a "shadow ACL column". Per cell ACLs are still associated with individual KVs but in effect tag storage is moved external to the data KV, so no changes to the KeyValue implementation are necessary, and stored in the same region. Will incur some additional cost for querying tags but will not require intra cluster RPC. 3. Store per cell ACLs in the ACL table. Tag storage again is external to the data KV, and now in a separate table. Will incur cost for querying tags and furthermore require intra cluster RPC. Probably won't actually try this. You use acl table, or something else? The ACL is stored "in" the KV (depends which impl option if that is physical or logical). The table, row, family, qualifier, and timestamp of an in-KV permission are implicit – those of the KV's location. I do use TablePermission to represent ACLs at the KV level but only to avoid some refactoring in the AccessController that would be outside the scope of a proof of concept. Pardon if mentioning this detail was confusing. It doesn't make sense to attach an ACL on a Delete, because the KVs covered by the delete will be ... deleted. I don't completely agree with this but will not comment also unless I completely understand your approach. We treat ACLs on a KV as timestamped like the rest of the KV. An ACL in a new Put applies only to that Put. It doesn't change the ACL of any previous Put. However, we require a Put to have covering [1] permission – so either table perms, CF perms, or perms in the most recent visible existing value, if any, must allow the pending Put (or Append, or Increment, or any mutation) in order for it to be applied. This is because the pending operation will replace the visible value and its ACL, if it has one, for Gets and Scans that don't ask for more than one version. If you actually want to change the ACL on a specific KV, you must Delete then Put, if you have perms for that. On reads, a KV is visible to the InternalScanner if either table perms, CF perms, or perms stored in the KV allow, similar to visibility labels in a sense but with ACL semantics. The AccessControlFilter is extended to do that [2] . For Deletes, all visible prior versions covered by the Delete must allow the Delete, because a major compaction could remove them if we allow the tombstone and it covers them, regardless of any ACL they might contain. "Visible" here is defined as not covered by an already committed tombstone. This allows simple and straightforward evolution of security policy over time without requiring expensive updates [3] . So a Delete doesn't update ACLs. ACLs are bits of information to be stored with the KV. It's meaningless to put an ACL on an op that stores nothing. And, an ACL on a Delete would not grant that Delete permission to do something. It would be the union of per table, per CF, and all ACLs on visible covered values that would grant (or not). Since we are only looking at the most recent version of a KV, if any, except for Deletes, only Delete - and especially DeleteColumn - can be potentially expensive to check. There is no free lunch but union-of-ACL semantics helps: Granting appropriate per table or per CF perms to a user or group obviates the need for KV ACL checking for that user or her groups, so we can early out and avoid going to the store in that case. [1] Maybe "covering" isn't quite the best term. Suggestions welcome. [2] This changes AC semantics some. A user who doesn't have permissions to a CF or exact location (column+qualifier) just won't get values back from a scan or get, as opposed to previous behavior which would throw an AccessDeniedException if the user did not have required perms to the CF at the CF level. I refactored TestAccessController to address this in a clean way. [3] On the other hand GRANT and REVOKE commands could provide a convenient way to change per KV ACLs too, since they are DDL not DML ops. We could consider enhancing the GRANT and REVOKE command processing in the AccessController to replace all per KV ACLs covered by the action, if they exist. A possible optimization is lazy updates via a timestamped log of recent GRANTs and REVOKEs (but I just thought of that now so it's a superficial suggestion).
          Hide
          Andrew Purtell added a comment -

          I forgot to mention something and want to highlight something.

          The new KVs for Append and Increment inherit the KV tags (hence ACLs) of the value(s) they are updating. This is both logical and works around the fact that Increment is not a Mutation (see HBASE-7114).

          If Himanshu Vashishtha or someone else wants to implement Accumulo-style visibility labels (or if I ultimately end up doing it), then I encourage following the same design principles:

          • Coprocessor based implementation
          • Minimal to no changes to core code
          • Perhaps just building on a KV generic tags facility
          • Use OperationWithAttributes# {get,set}

            Attribute for passing through your new metadata

          Then, you can see how what I describe above and perhaps something else that implements Accumulo-style visibility labels can be consistent in deployment and API details and can be easily stacked on top of each other.

          Show
          Andrew Purtell added a comment - I forgot to mention something and want to highlight something. The new KVs for Append and Increment inherit the KV tags (hence ACLs) of the value(s) they are updating. This is both logical and works around the fact that Increment is not a Mutation (see HBASE-7114 ). If Himanshu Vashishtha or someone else wants to implement Accumulo-style visibility labels (or if I ultimately end up doing it), then I encourage following the same design principles: Coprocessor based implementation Minimal to no changes to core code Perhaps just building on a KV generic tags facility Use OperationWithAttributes# {get,set} Attribute for passing through your new metadata Then, you can see how what I describe above and perhaps something else that implements Accumulo-style visibility labels can be consistent in deployment and API details and can be easily stacked on top of each other .
          Hide
          stack added a comment -

          If cp fails to load, will all the data then become visible Andrew? Anything we can do to have it default to be non-viewable unless via a visibility CP (Otherwise, adding generic tags facility to kv w/ visibility done in a visibility cp sounds sweet)

          Show
          stack added a comment - If cp fails to load, will all the data then become visible Andrew? Anything we can do to have it default to be non-viewable unless via a visibility CP (Otherwise, adding generic tags facility to kv w/ visibility done in a visibility cp sounds sweet)
          Hide
          Andrew Purtell added a comment -

          If cp fails to load, will all the data then become visible Andrew?

          This is straightforward to address. The CP framework can fail to open a region if required CPs cannot be loaded and initialized. This is HBASE-6873.

          Show
          Andrew Purtell added a comment - If cp fails to load, will all the data then become visible Andrew? This is straightforward to address. The CP framework can fail to open a region if required CPs cannot be loaded and initialized. This is HBASE-6873 .
          Hide
          Andrew Purtell added a comment -

          Split into subtasks.

          Show
          Andrew Purtell added a comment - Split into subtasks.
          Hide
          Andrew Purtell added a comment - - edited

          Attached 6222-aclcf.patch. This is the implementation option which uses a shadow column family for cell ACL metadata storage.

          Edit: I had to context switch to something else and am coming back to this now. I would like to update this patch with more unit tests so should not be considered a final version of anything.

          Show
          Andrew Purtell added a comment - - edited Attached 6222-aclcf.patch. This is the implementation option which uses a shadow column family for cell ACL metadata storage. Edit: I had to context switch to something else and am coming back to this now. I would like to update this patch with more unit tests so should not be considered a final version of anything.
          Hide
          Andrew Purtell added a comment -

          Attached 'cell-acls-kv-tags-not-for-review.zip', an implementation option that uses inline storage for ACL data.

          Show
          Andrew Purtell added a comment - Attached 'cell-acls-kv-tags-not-for-review.zip', an implementation option that uses inline storage for ACL data.
          Hide
          ramkrishna.s.vasudevan added a comment -
             } while (more);
              scanner.close();
          

          In AccessController.requireCoveringPermission the close should be in finally block?

          // Before per cell ACLs we used to return the NEXT_COL hint, but can no
              // no longer do that since, given the possibility of per cell ACLs
              // anywhere, we now need to examine all KVs with this filter.
          

          May be we can move this before the 'if' condition just before this.

          Show
          ramkrishna.s.vasudevan added a comment - } while (more); scanner.close(); In AccessController.requireCoveringPermission the close should be in finally block? // Before per cell ACLs we used to return the NEXT_COL hint, but can no // no longer do that since, given the possibility of per cell ACLs // anywhere, we now need to examine all KVs with this filter. May be we can move this before the 'if' condition just before this.
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Test tag misses for testCellPermissions(). Was it intentional?

          Show
          ramkrishna.s.vasudevan added a comment - @Test tag misses for testCellPermissions(). Was it intentional?
          Hide
          ramkrishna.s.vasudevan added a comment -

          Andy,
          The memstore flushes if more than 1 CF exists, will it have an impact on this new CF introduced?
          One more question is,
          Once in the AccessController hooks we have ensured that the permission is available by checking the new CF acl, when the actual scan goes we can avoid this Cell right? May be am missing something. Pls correct me if am wrong.

          Show
          ramkrishna.s.vasudevan added a comment - Andy, The memstore flushes if more than 1 CF exists, will it have an impact on this new CF introduced? One more question is, Once in the AccessController hooks we have ensured that the permission is available by checking the new CF acl , when the actual scan goes we can avoid this Cell right? May be am missing something. Pls correct me if am wrong.
          Hide
          Ted Yu added a comment -

          Andrew Purtell:
          The attached document is interesting.
          Do you have performance comparison for writes ?

          Thanks

          Show
          Ted Yu added a comment - Andrew Purtell : The attached document is interesting. Do you have performance comparison for writes ? Thanks
          Hide
          Andrew Purtell added a comment -

          Ted Yu The results for writes are not interesting, boxplots align etc. I would have expected some impact to be noticeable so I plan to look into that more. Stay tuned.

          Show
          Andrew Purtell added a comment - Ted Yu The results for writes are not interesting, boxplots align etc. I would have expected some impact to be noticeable so I plan to look into that more. Stay tuned.
          Hide
          Andrew Purtell added a comment -

          In AccessController.requireCoveringPermission the close should be in finally block?

          Ok.

          May be we can move this before the 'if' condition just before this.

          Ok.

          @Test tag misses for testCellPermissions(). Was it intentional?

          No. Checked the test logs and it still runs, but will add this of course.

          The memstore flushes if more than 1 CF exists, will it have an impact on this new CF introduced?

          The ACL CF is only hidden from the client.

          Once in the AccessController hooks we have ensured that the permission is available by checking the new CF acl, when the actual scan goes we can avoid this Cell right?

          We could modify the Scan object in a preScannerOpen hook to exclude the ACL CF. The values from that family are not used in the filter. (I seem to remember exploring that idea is why no such exclusion presently.)

          Thanks for the review!

          Show
          Andrew Purtell added a comment - In AccessController.requireCoveringPermission the close should be in finally block? Ok. May be we can move this before the 'if' condition just before this. Ok. @Test tag misses for testCellPermissions(). Was it intentional? No. Checked the test logs and it still runs, but will add this of course. The memstore flushes if more than 1 CF exists, will it have an impact on this new CF introduced? The ACL CF is only hidden from the client. Once in the AccessController hooks we have ensured that the permission is available by checking the new CF acl, when the actual scan goes we can avoid this Cell right? We could modify the Scan object in a preScannerOpen hook to exclude the ACL CF. The values from that family are not used in the filter. (I seem to remember exploring that idea is why no such exclusion presently.) Thanks for the review!
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Andy
          May be HBASE-5416 can be used here. Need to go thro the patch once again to see if it is possible.

          Show
          ramkrishna.s.vasudevan added a comment - @Andy May be HBASE-5416 can be used here. Need to go thro the patch once again to see if it is possible.
          Hide
          Ted Yu added a comment - - edited

          I was thinking about HBASE-5416 as well.
          However, user scan may be utilizing HBASE-5416 at the same time.

          Meaning, there might be two essential families.

          Show
          Ted Yu added a comment - - edited I was thinking about HBASE-5416 as well. However, user scan may be utilizing HBASE-5416 at the same time. Meaning, there might be two essential families.
          Hide
          ramkrishna.s.vasudevan added a comment -

          Yes. Going thro the patch once again.
          Basically we say if a family is essential or not as part of Filters.
          So once ACL layer is done we should have a mechanism to avoid a CF ( here it is the acl CF).

          Show
          ramkrishna.s.vasudevan added a comment - Yes. Going thro the patch once again. Basically we say if a family is essential or not as part of Filters. So once ACL layer is done we should have a mechanism to avoid a CF ( here it is the acl CF).
          Hide
          Andrew Purtell added a comment - - edited

          With HBASE-5416 maybe the AccessController can add or wrap any filters on the Scan with a filter that excludes the ACL CF since the AccessControlFilter doesn't consult that data inline with scanner iteration.

          Longer term though most likely the AccessControlFilter will have ACL data available inline with the KVs (as tags). See the alternate implementation attached to this JIRA for an example of how that could work. The separate ACL CF would go away.

          If cell tags can make it into 0.96, then the ACL CF could go away now. 5416 would not be useful in that case.

          Edit: I misread the tail of 5416, so it's in trunk already.

          Show
          Andrew Purtell added a comment - - edited With HBASE-5416 maybe the AccessController can add or wrap any filters on the Scan with a filter that excludes the ACL CF since the AccessControlFilter doesn't consult that data inline with scanner iteration. Longer term though most likely the AccessControlFilter will have ACL data available inline with the KVs (as tags). See the alternate implementation attached to this JIRA for an example of how that could work. The separate ACL CF would go away. If cell tags can make it into 0.96, then the ACL CF could go away now. 5416 would not be useful in that case. Edit: I misread the tail of 5416, so it's in trunk already.
          Hide
          ramkrishna.s.vasudevan added a comment -

          If cell tags can make it into 0.96, then the ACL CF could go away now. 5416 would not be useful in that case

          Ya of course right.

          Show
          ramkrishna.s.vasudevan added a comment - If cell tags can make it into 0.96, then the ACL CF could go away now. 5416 would not be useful in that case Ya of course right.
          Hide
          Andrew Purtell added a comment -

          Updated file 6222.pdf. Updated results for minicluster load tests with latest Hadoop 2.0.4-SNAPSHOT + HBase 0.95-SNAPSHOT (just before branch) and not using multiPuts. Now I'm seeing what I expect.

          Show
          Andrew Purtell added a comment - Updated file 6222.pdf. Updated results for minicluster load tests with latest Hadoop 2.0.4-SNAPSHOT + HBase 0.95-SNAPSHOT (just before branch) and not using multiPuts. Now I'm seeing what I expect.
          Hide
          Andrew Purtell added a comment -

          Some feedback from the Feb 28 HUG: Consider storing a reference to an ACL in the KV (logically in the ACL CF or inline as a tag) instead of the ACL itself. The ACL could be stored, versioned, in the acl table.

          This is an interesting idea. It would address the case where the same ACL is stored in hundreds or thousands (or more) of values. We should save some space by writing the actual ACL only to one location. This also facilitates easy updates of the ACL, as a single atomic change, especially useful in the case where the initial mutations may have included an incorrect ACL. However, the tradeoff is two lookups instead of one for deciding what to do with every KeyValue: first, to the ACL CF to get the reference, if any, associated with the KV (but at least this is within the region); the second to a probably remote region of the acl table to retrieve the ACL itself. It could be possible to cache the results of the ACL data lookups for a limited time, for however long a policy decision can be allowed to be out of date. If we have KV tags, the first lookup goes away but unfortunately the second remains, taking back some (most?) of the performance/latency gains we expect from inline storage.

          I'm not sure the benefits outweigh the drawbacks.

          Show
          Andrew Purtell added a comment - Some feedback from the Feb 28 HUG: Consider storing a reference to an ACL in the KV (logically in the ACL CF or inline as a tag) instead of the ACL itself. The ACL could be stored, versioned, in the acl table. This is an interesting idea. It would address the case where the same ACL is stored in hundreds or thousands (or more) of values. We should save some space by writing the actual ACL only to one location. This also facilitates easy updates of the ACL, as a single atomic change, especially useful in the case where the initial mutations may have included an incorrect ACL. However, the tradeoff is two lookups instead of one for deciding what to do with every KeyValue: first, to the ACL CF to get the reference, if any, associated with the KV (but at least this is within the region); the second to a probably remote region of the acl table to retrieve the ACL itself. It could be possible to cache the results of the ACL data lookups for a limited time, for however long a policy decision can be allowed to be out of date. If we have KV tags, the first lookup goes away but unfortunately the second remains, taking back some (most?) of the performance/latency gains we expect from inline storage. I'm not sure the benefits outweigh the drawbacks.
          Hide
          Nicolas Liochon added a comment -

          If I understand well, the choice depends on the number of different ACL are we expecting and if we have access patterns that would make caching efficient. Do we know this? With a reference based implementation, for a scan it would make sense to first check the reference of the ACL that gives us a read access, and then use this for all the kv in the table. The caching duration policy could be per scan.

          Lastly, if it's considered as "nearly immutable" data, it's possible to use a ZooKeeper node to invalidate the cache on an update. This works well if you have a few update per hour (with as many insert as you like). Then policy is 'infinite cache'.

          Show
          Nicolas Liochon added a comment - If I understand well, the choice depends on the number of different ACL are we expecting and if we have access patterns that would make caching efficient. Do we know this? With a reference based implementation, for a scan it would make sense to first check the reference of the ACL that gives us a read access, and then use this for all the kv in the table. The caching duration policy could be per scan. Lastly, if it's considered as "nearly immutable" data, it's possible to use a ZooKeeper node to invalidate the cache on an update. This works well if you have a few update per hour (with as many insert as you like). Then policy is 'infinite cache'.
          Hide
          Andrew Purtell added a comment -

          the choice depends on the number of different ACL are we expecting and if we have access patterns that would make caching efficient. Do we know this?

          Maybe for a given row. Wouldn't say so in general but there's no "real world" user usage data. We should expect best practice for a very common cell ACL is a factoring of it to out a CF or table grant, to avoid any IO checking cover for mutations. So at the cell level either no data or probably lots of varying ACLs.

          Show
          Andrew Purtell added a comment - the choice depends on the number of different ACL are we expecting and if we have access patterns that would make caching efficient. Do we know this? Maybe for a given row. Wouldn't say so in general but there's no "real world" user usage data. We should expect best practice for a very common cell ACL is a factoring of it to out a CF or table grant, to avoid any IO checking cover for mutations. So at the cell level either no data or probably lots of varying ACLs.
          Hide
          Nicolas Liochon added a comment -

          Ok. For the use cases, do we know how Accumulo is used today?

          Show
          Nicolas Liochon added a comment - Ok. For the use cases, do we know how Accumulo is used today?
          Hide
          Andrew Purtell added a comment -

          For the use cases, do we know how Accumulo is used today?

          Accumulo provides labels, HBASE-7663

          Show
          Andrew Purtell added a comment - For the use cases, do we know how Accumulo is used today? Accumulo provides labels, HBASE-7663
          Hide
          Matt Corgan added a comment -

          My understanding of the goal of per-KeyValue security is that it's to eliminate all doubt that levels above the KeyValue could break or misinterpret permissions. If that's accurate, would adding caching and indirection layers above KeyValue possibly compromise the fundamental design goal?

          Show
          Matt Corgan added a comment - My understanding of the goal of per-KeyValue security is that it's to eliminate all doubt that levels above the KeyValue could break or misinterpret permissions. If that's accurate, would adding caching and indirection layers above KeyValue possibly compromise the fundamental design goal?
          Hide
          Andrew Purtell added a comment -

          If that's accurate, would adding caching and indirection layers above KeyValue possibly compromise the fundamental design goal

          I'm inclined to agree. Certainly it complicates it. Shouldn't be in an initial drop for the sake of establishing correctness first.

          Show
          Andrew Purtell added a comment - If that's accurate, would adding caching and indirection layers above KeyValue possibly compromise the fundamental design goal I'm inclined to agree. Certainly it complicates it. Shouldn't be in an initial drop for the sake of establishing correctness first.
          Hide
          Andrew Purtell added a comment -

          And to be clear I'd like to surface this discussion from the HUG but am not advocating for such changes.

          Show
          Andrew Purtell added a comment - And to be clear I'd like to surface this discussion from the HUG but am not advocating for such changes.
          Hide
          Andrew Purtell added a comment -

          Should we commit the "shadow column family" based approach we have today and migrate later to something based on KV tags, or wait for tags proper?

          Show
          Andrew Purtell added a comment - Should we commit the "shadow column family" based approach we have today and migrate later to something based on KV tags, or wait for tags proper?
          Hide
          Enis Soztutar added a comment -

          Should we commit the "shadow column family" based approach we have today and migrate later to something based on KV tags, or wait for tags proper?

          I think you mentioned that shadow column family approach is considerably slower than a native tag based approach. I think the questions are (1) whether it still justifies given the performance penalty (2) How far along we are from having tags in Cells.

          Show
          Enis Soztutar added a comment - Should we commit the "shadow column family" based approach we have today and migrate later to something based on KV tags, or wait for tags proper? I think you mentioned that shadow column family approach is considerably slower than a native tag based approach. I think the questions are (1) whether it still justifies given the performance penalty (2) How far along we are from having tags in Cells.
          Hide
          Andrew Purtell added a comment -

          I think you mentioned that shadow column family approach is considerably slower than a native tag based approach.

          Some single host benchmarks that illustrate the difference are in the PDF attached to this JIRA. There's a price for security but we would pay less with tags in cells. Of course, if you don't use cell ACLs then it's not an issue either way.

          How far along we are from having tags in Cells.

          If the timeframe for that is a couple of months, I'd recommend waiting. Anyway, this issue has been sitting here for a while, but not on account of the patch not being in a committable state.

          Show
          Andrew Purtell added a comment - I think you mentioned that shadow column family approach is considerably slower than a native tag based approach. Some single host benchmarks that illustrate the difference are in the PDF attached to this JIRA. There's a price for security but we would pay less with tags in cells. Of course, if you don't use cell ACLs then it's not an issue either way. How far along we are from having tags in Cells. If the timeframe for that is a couple of months, I'd recommend waiting. Anyway, this issue has been sitting here for a while, but not on account of the patch not being in a committable state.
          Hide
          ramkrishna.s.vasudevan added a comment -

          @Andy
          I think you need to update the patch with the default deny approach in the absence of KVs granting the action and then commit it if you are planning to.

          Show
          ramkrishna.s.vasudevan added a comment - @Andy I think you need to update the patch with the default deny approach in the absence of KVs granting the action and then commit it if you are planning to.
          Hide
          Andrew Purtell added a comment -

          @Andy
          I think you need to update the patch with the default deny approach in the absence of KVs granting the action and then commit it if you are planning to.

          Thanks ramkrishna.s.vasudevan. I put up the refreshed patches on HBASE-7661.

          Show
          Andrew Purtell added a comment - @Andy I think you need to update the patch with the default deny approach in the absence of KVs granting the action and then commit it if you are planning to. Thanks ramkrishna.s.vasudevan . I put up the refreshed patches on HBASE-7661 .
          Hide
          Anoop Sam John added a comment -

          Great ! All the subtasks are closed now. Good to close this main Jira?

          Show
          Anoop Sam John added a comment - Great ! All the subtasks are closed now. Good to close this main Jira?
          Hide
          Andrew Purtell added a comment -

          Great ! All the subtasks are closed now. Good to close this main Jira?

          Resolved! Let's party.

          Show
          Andrew Purtell added a comment - Great ! All the subtasks are closed now. Good to close this main Jira? Resolved! Let's party.
          Hide
          Lars Hofhansl added a comment -

          Yeah!

          Show
          Lars Hofhansl added a comment - Yeah!
          Hide
          stack added a comment -

          Excellent lads.

          Show
          stack added a comment - Excellent lads.

            People

            • Assignee:
              Unassigned
              Reporter:
              stack
            • Votes:
              0 Vote for this issue
              Watchers:
              55 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development