diff --git src/main/docbkx/appendix_acl_matrix.xml src/main/docbkx/appendix_acl_matrix.xml new file mode 100644 index 0000000..a0ed317 --- /dev/null +++ src/main/docbkx/appendix_acl_matrix.xml @@ -0,0 +1,652 @@ + + + + + Access Control Matrix + The following matrix shows the minimum permission set required to perform operations in + HBase. Before using the table, read through the information about how to interpret it. + + Interpreting the ACL Matrix Table + The following conventions are used in the ACL Matrix table: + + Scopes + + Permissions are evaluated starting at the widest scope and working to the + narrowest scope. A scope corresponds to a level of the data model. From broadest to + narrowest, the scopes are as follows:: + + Global + Namespace (NS) + Table + Column Family (CF) + Column Qualifier (CQ) + Cell + + For instance, a permission granted at table level dominates any grants done at + the Column Family, Column Qualifier, or cell level. The user can do what that + grant implies at any location in the table. A permission granted at global scope + dominates all: the user is always allowed to take that action everywhere. + + + + Permissions + + Possible permissions include the following: + + Superuser - a special user that belongs to group "supergroup" and has + unlimited access + Admin (A) + Create (C) + Write (W) + Read (R) + Execute (X) + + + + + + For the most part, permissions work in an expected way, with the following caveats: + + + Having Write permission does not imply Read permission. It is possible and sometimes + desirable for a user to be able to write data that same user cannot read. One such example + is a log-writing process. + + + The hbase:meta table is readable by every user, regardless + of the user's other grants or restrictions. This is a requirement for HBase to + function correctly. + + + CheckAndPut and CheckAndDelete operations will fail if the user does not have both + Write and Read permission. 
+ + + Increment and Append operations do not require Read access. + + + + The following table is sorted by the interface that provides each operation. In case the + table goes out of date, the unit tests which check for accuracy of permissions can be found + in + hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java, + and the access controls themselves can be examined in + hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java. + + + ACL Matrix + + + + Interface + Operation + Minimum Scope + Minimum Permission + + + + + + + Master + + + createTable + + + Global + + + A + + + + + modifyTable + + + Table + + + A|C + + + + + deleteTable + + + Table + + + A|C + + + + + truncateTable + + + Table + + + A|C + + + + + addColumn + + + Table + + + A|C + + + + + modifyColumn + + + Table + + + A|C + + + + + deleteColumn + + + Table + + + A|C + + + + + disableTable + + + Table + + + A|C + + + + + disableAclTable + + + None + + + Not allowed + + + + + enableTable + + + Table + + + A|C + + + + + move + + + Global + + + A + + + + + assign + + + Global + + + A + + + + + unassign + + + Global + + + A + + + + + regionOffline + + + Global + + + A + + + + + balance + + + Global + + + A + + + + + balanceSwitch + + + Global + + + A + + + + + shutdown + + + Global + + + A + + + + + stopMaster + + + Global + + + A + + + + + snapshot + + + Global + + + A + + + + + clone + + + Global + + + A + + + + + restore + + + Global + + + A + + + + + deleteSnapshot + + + Global + + + A + + + + + createNamespace + + + Global + + + A + + + + + deleteNamespace + + + Namespace + + + A + + + + + modifyNamespace + + + Namespace + + + A + + + + + flushTable + + + Table + + + A|C + + + + + getTableDescriptors + + + Global|Table + + + A + + + + + mergeRegions + + + Global + + + A + + + + Region + + open + Global + A + + + + openRegion + + + Global + + + A + + + + close + Global + A + + + + closeRegion + + + Global + + + A + + + + + 
stopRegionServer + + + Global + + + A + + + + + mergeRegions + + + Global + + + A + + + + append + Table|CF|CQ + W + + + delete + Table|CF|CQ|Cell (if the user has write permission for all cells) + W + + + exists + Table|CF|CQ + R + + + get + Table|CF|CQ + R + + + getClosestRowBefore + Table|CF|CQ + R + + + increment + Table|CF|CQ + W + + + put + Table|CF|CQ + W + + + + flush + + + Global|Table + + + A|C + + + + + split + + + Global|Table + + + A + + + + + compact + + + Global|Table + + + A|C + + + + bulkLoadHFile + Table + W + + + prepareBulkLoad + Table + C + + + cleanupBulkLoad + Table + W + + + checkAndDelete + Table|CF|CQ + RW + + + checkAndPut + Table|CF|CQ + RW + + + incrementColumnValue + Table|CF|CQ + RW + + + scannerClose + Table + R + + + scannerNext + Table + R + + + scannerOpen + Table|CQ|CF + R + + + + Endpoint + + + invoke + + Endpoint + + X + + + + + AccessController + + + grant + + Global|Table|NS + + A + + + + + revoke + + Global|Table|NS + + A + + + + + getUserPermissions + + + Global|Table|NS + + + A + + + + + checkPermissions + + + Global|Table|NS + + + A + + + + +
+
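The scope rule described at the start of this appendix (a grant at a broader scope dominates narrower scopes) can be sketched in plain Java. This is an illustrative model only, not the AccessController implementation: the class `ScopeDominanceSketch`, its enums, and the `isAllowed` method are hypothetical names, and the sketch assumes that every grant in the map lies on the same path through the data model (one namespace, one table, one column family, and so on).

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class ScopeDominanceSketch {
    /** Scopes, declared from broadest to narrowest (hypothetical model). */
    public enum Scope { GLOBAL, NAMESPACE, TABLE, COLUMN_FAMILY, COLUMN_QUALIFIER, CELL }

    /** Permissions named in the matrix (hypothetical model). */
    public enum Permission { ADMIN, CREATE, WRITE, READ, EXECUTE }

    /**
     * Walks the scopes from broadest to narrowest and stops at the first grant
     * that contains the wanted permission. Scopes narrower than the target are
     * never consulted.
     */
    public static boolean isAllowed(Map<Scope, Set<Permission>> grants,
                                    Scope target, Permission wanted) {
        for (Scope scope : Scope.values()) {
            Set<Permission> granted = grants.get(scope);
            if (granted != null && granted.contains(wanted)) {
                return true;
            }
            if (scope == target) {
                break;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Scope, Set<Permission>> grants = new EnumMap<>(Scope.class);
        grants.put(Scope.TABLE, EnumSet.of(Permission.READ));

        // A table-level Read grant covers reads of any cell in that table.
        System.out.println(isAllowed(grants, Scope.CELL, Permission.READ));
        // The same grant says nothing about the enclosing namespace.
        System.out.println(isAllowed(grants, Scope.NAMESPACE, Permission.READ));
        // Read does not imply Write, and Write does not imply Read.
        System.out.println(isAllowed(grants, Scope.CELL, Permission.WRITE));
    }
}
```

The loop order mirrors the "widest scope first" evaluation described above. A real deployment also has to account for superusers and table ownership, which this sketch leaves out.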
+ + \ No newline at end of file diff --git src/main/docbkx/book.xml src/main/docbkx/book.xml index 01c6b41..d75d82e 100644 --- src/main/docbkx/book.xml +++ src/main/docbkx/book.xml @@ -5094,6 +5094,7 @@ This option should not normally be used, and it is not in -fixAll. + @@ -5159,7 +5160,7 @@ This option should not normally be used, and it is not in -fixAll. - + Data Block Encoding Types Prefix - Often, keys are very similar. Specifically, keys often share a common prefix diff --git src/main/docbkx/security.xml src/main/docbkx/security.xml index c74af9b..e4a08d8 100644 --- src/main/docbkx/security.xml +++ src/main/docbkx/security.xml @@ -463,1233 +463,1355 @@ grant 'rest_server', 'RWCA' -
- Tags - Every cell can have metadata associated with it. Adding metadata in the data part of - every cell would make things difficult. - The 0.98 version of HBase solves this problem by providing Tags along with the cell - format. Some of the usecases that uses the tags are Visibility labels, Cell level ACLs, etc. - HFile V3 version from 0.98 onwards supports tags and this feature can be turned on using - the following configuration - + Securing Access To Your Data + After you have configured secure authentication between HBase client and server processes + and gateways, you need to consider the security of your data itself. HBase provides several + strategies for securing your data: + + + Role-based Access Control (RBAC), which uses groups and permissions to control who can + read and write to a given HBase resource or execute a coprocessor endpoint, based upon + their role. + + + Visibility Labels, which allow you to label cells and control access to labelled cells, + to further restrict who can read or write to certain subsets of your data. Visibility + labels are stored as tags. See for more information. + + + Transparent encryption of data at rest on the underlying filesystem, both in HFiles + and in the WAL. This protects your data at rest from an attacker who has access to the + underlying filesystem, without the need to change the implementation of the client. It can + also protect against data leakage from improperly disposed disks, which can be important + for legal and regulatory compliance. + + + Server-side configuration, administration, and implementation details of each of these + features are discussed below, along with any performance trade-offs. An example security + configuration is given at the end, to show these features all used together, as they might be + in a real-world scenario. + + All aspects of security in HBase are in active development and evolving rapidly.
Any strategy you employ + for security of your data should be thoroughly tested. In addition, some of these features + are still in the experimental stage of development. To take advantage of many of these + features, you must be running HBase 0.98+ and using the HFile v3 file format. + + + + + Basic Server-Side Configuration + + Enable HFile v3, by setting to 3 in + hbase-site.xml. This is the default for HBase 0.99 and + newer. + hfile.format.version 3 - ]]> - Every cell can have zero or more tags. Every tag has a type and the actual tag byte - array. The types 0-31 are reserved for System tags. For example ‘1’ is - reserved for ACL and ‘2’ is reserved for Visibility tags. - The way rowkeys, column families, qualifiers and values are encoded using different - Encoding Algos, similarly the tags can also be encoded. Tag encoding can be turned on per CF. - Default is always turn ON. To turn on the tag encoding on the HFiles use - - Note that encoding of tags takes place only if the DataBlockEncoder is enabled for the - CF. - As we compress the WAL entries using Dictionary the tags present in the WAL can also be - compressed using Dictionary. Every tag is compressed individually using WAL Dictionary. To - turn ON tag compression in WAL dictionary enable the property - - hbase.regionserver.wal.tags.enablecompression + ]]> + + + Enable SASL and Kerberos authentication for RPC and ZooKeeper, as described in and . + + + +
+ Tags + Tags are a feature of HFile v3. A tag is a piece of metadata + which is part of a cell, separate from the key, value, and version. Tags are an + implementation detail which provides a foundation for other security-related features such + as cell-level ACLs and visibility labels. Tags are stored in the HFiles themselves. It is + possible that in the future, tags will be used to implement other HBase features. You don't + need to know a lot about tags in order to use the security features they enable. + To enable HFile v3, in order to use features that rely on tags, set + hfile.format.version to 3 in + hbase-site.xml. +
+ Implementation Details + Every cell can have zero or more tags. Every tag has a type and the actual tag byte + array. + Just as row keys, column families, qualifiers and values can be encoded, tags can also be encoded. You can enable + or disable tag encoding at the level of the column family. It is enabled by default. Use + the HColumnDescriptor#setCompressionTags(boolean compressTags) method to + manage encoding settings on a column family. You also need to enable the DataBlockEncoder + for the column family, for encoding of tags to take effect. + You can enable compression of each tag in the WAL, if WAL compression is also enabled, + by setting the value of hbase.regionserver.wal.tags.enablecompression to + true in hbase-site.xml. Tag compression uses + dictionary encoding. Tag compression is not supported when using WAL encryption. +
+
+ +
+ Access Control Labels (ACLs) +
+ How It Works + ACLs in HBase are based upon a user's membership in or exclusion from groups, and a + given group's permissions to access a given resource. ACLs are implemented as a + coprocessor called AccessController. + The users are stored in a directory such as an LDAP or Active Directory + service. A Hadoop group mapper maps between directory groups and + HBase users. Any supported Hadoop group mapper will work. Users are then + granted specific permissions (Read, Write, Execute, Create, Admin) against resources + (global, namespaces, tables, cells, or endpoints). + + With Kerberos and Access Control enabled, client access to HBase is authenticated + and user data is private unless access has been explicitly granted. + + HBase has a simpler feature set than relational databases, especially in terms of + client operations. No distinction is made between an insert (new record) and update (of + existing record), for example, as both collapse down into a Put. Accordingly, the + important operations condense to four permissions: READ, WRITE, CREATE, and ADMIN. + + Operation To Permission Mapping + + + + + + Permission + Operation + + + + + + Read + Get + + + + Exists + + + + Scan + + + + Write + Put + + + + Delete + + + + IncrementColumnValue + + + + CheckAndDelete/Put + + + + Create + Create + + + + Alter + + + + Drop + + + + Bulk Load + + + + Admin + Enable/Disable + + + + Snapshot/Restore/Clone + + + + Split + + + + Flush + + + + Compact + + + + Major Compact + + + + Grant + + + + Revoke + + + + Shutdown + + + +
+ Permissions can be granted in any of the following scopes, though CREATE and ADMIN + permissions are effective only at table, namespace, and global scopes. + + + + Namespace + + + + Read: User can read any table in the namespace. + + + Write: User can write to any table in the namespace. + + + Create: User can create tables in the namespace. + + + Admin: User can alter table attributes; add, alter, or drop column families; + and enable, disable, or drop the table. User can also trigger region + (re)assignments or relocation. + + + + + + Table + + + + Read: User can read from any column family in the table. + + + Write: User can write to any column family in the table. + + + Create: User can alter table attributes; add, alter, or drop column + families; and drop the table. + + + Admin: User can alter table attributes; add, alter, or drop column families; + and enable, disable, or drop the table. User can also trigger region + (re)assignments or relocation. + + + + + + Column Family / Column Qualifier / Cell + + + + Read: User can read at the specified scope. + + + Write: User can write at the specified scope. + + + + + + Coprocessor Endpoint + + Execute: User can execute the coprocessor endpoint. + + + + Global + + The superuser, or list of superusers, is specified as a comma-separated list of + users and groups in the option in + hbase-site.xml, and has global scope in HBase. The superuser is + equivalent to the root user in a UNIX environment. At a minimum, + the superuser should include the principal used to run the HMaster process. Global + admin privileges, which are implicitly granted to the superuser, are required to + create namespaces, switch the balancer on and off, or take other actions with global + consequences. The superuser can also grant all permissions to all resources. + + + + + Tables have a new metadata attribute OWNER, which is the user + principal who owns the table.
By default, the owner is the user principal who creates the + table, though it may be changed at table creation time or during an + alter operation by setting or changing the OWNER + table attribute. Only a single user principal can own a table at a given time. A table + owner is implicitly granted all permissions to a given table. + + ACL Matrix + For more details on how ACLs map to specific HBase operations and tasks, see the Access Control Matrix appendix. + ACLs can be used together with Visibility Labels. + Cell-level ACLs are implemented using tags (see the Tags section). In + order to use cell-level ACLs, you must be using HFile v3 and HBase 0.98 or newer. + + ACL Implementation Caveats + + Files created by HBase are owned by the operating system user running the HBase + process. To interact with HBase files, you should use the API or bulk load + facility. + + + HBase does not model "roles" internally. Instead, group names can be + granted permissions. This allows external modeling of roles via group membership. + Groups are created and manipulated externally to HBase, via the Hadoop group mapping + service. + + + +
+
+ Server-Side Configuration + + + As a prerequisite, perform the steps in Basic Server-Side Configuration. + + Install and configure the AccessController coprocessor, by setting the following + properties in hbase-site.xml. These properties take a list of + classes. + If you use the AccessController along with the VisibilityController, the + AccessController must come first in the list, because with both components active, the + VisibilityController will delegate access control on its system tables to the + AccessController. + + hbase.coprocessor.region.classes + org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.token.TokenProvider + + + hbase.coprocessor.master.classes + org.apache.hadoop.hbase.security.access.AccessController + + + hbase.coprocessor.regionserver.classes + org.apache.hadoop.hbase.security.access.AccessController + + + hbase.security.exec.permission.checks true - ]]> - To add tags to every cell during Puts, the following apis are provided - + ]]> + Optionally, you can enable transport security, by setting + to auth-conf. This requires + HBase 0.98.4 or newer. + + + Set up the Hadoop group mapper in the Hadoop namenode's + core-site.xml. This is a Hadoop file, not an HBase file. + Customize it to your site's needs. Following is an example. + + hadoop.security.group.mapping + org.apache.hadoop.security.LdapGroupsMapping + - Some of the feature developed using tags are Cell level ACLs and Visibility labels. These - are some features that use tags framework and allows users to gain better security features on - cell level. - For details, see: - - Access Control - Visibility labels - -
+ + hadoop.security.group.mapping.ldap.url + ldap://server + -
- Access Control - Newer releases of Apache HBase (>= 0.92) support optional access control list (ACL-) - based protection of resources on a column family and/or table basis. - This describes how to set up Secure HBase for access control, with an example of granting - and revoking user permission on table resources provided. + + hadoop.security.group.mapping.ldap.bind.user + Administrator@example-ad.local + -
- Prerequisites - You must configure HBase for secure or simple user access operation. Refer to the Secure Client Access to HBase or Simple User Access to HBase sections and - complete all of the steps described there. - For secure access, you must also configure ZooKeeper for secure operation. Changes to - ACLs are synchronized throughout the cluster using ZooKeeper. Secure authentication to - ZooKeeper must be enabled or otherwise it will be possible to subvert HBase access control - via direct client access to ZooKeeper. Refer to the section on secure ZooKeeper - configuration and complete all of the steps described there. -
+ + hadoop.security.group.mapping.ldap.bind.password + **** + + + hadoop.security.group.mapping.ldap.base + dc=example-ad,dc=local + + + + hadoop.security.group.mapping.ldap.search.filter.user + (&(objectClass=user)(sAMAccountName={0})) + + + + hadoop.security.group.mapping.ldap.search.filter.group + (objectClass=group) + + + + hadoop.security.group.mapping.ldap.search.attr.member + member + + + + hadoop.security.group.mapping.ldap.search.attr.group.name + cn + + + ]]> + + + Optionally, enable the early-out evaluation strategy. Prior to HBase 0.98.0, if a + user was not granted access to a column family, or at least a column qualifier, an + AccessDeniedException would be thrown. HBase 0.98.0 removed this exception in order to + allow cell-level exceptional grants. To restore the old behavior in HBase 0.98.x, set + to true in + hbase-site.xml. + + + Distribute your configuration and restart your cluster for changes to take + effect. + + + To test your configuration, log into HBase Shell as a given user and use the + whoami command to report the groups your user is part of. In this example, the user is + reported as being a member of the services group. + +hbase> whoami +service (auth:KERBEROS) + groups: services + + + +
+
+ Administration + Administration tasks can be performed from HBase Shell or via an API. + + + User and Group Administration + Users and groups are maintained external to HBase, in your directory. + + + Granting Access To A Namespace, Table, Column Family, or Cell + There are a few different types of syntax for grant statements. The first, and + most familiar, is as follows, with the table, column family, and column qualifier being optional: + grant 'user', 'RWXCA', 'TABLE', 'CF', 'CQ' + Groups and users are granted access in the same way, but groups are prefixed with + an @ symbol. Likewise, tables and namespaces are specified + in the same way, but namespaces are prefixed with an @ + symbol. + It is also possible to specify multiple grants against the same resource in a + single statement, as in this example. The first sub-clause maps users to + ACLs and the second sub-clause specifies the resource. + The following statement grants permissions to the user table, + specifically cells in the pii column in rows whose key starts with the string + test. The developers group is granted Read and Write, and the + testuser user is granted Read. + + HBase Shell support for granting and revoking access is for testing and verification + support, and should not be employed for production use because it won't apply the + permissions to cells that don't exist yet. The correct way to apply cell level + permissions is to do so in the application code when storing the values. + + + HBase Shell + + + Global: + hbase> grant '@admins', 'RWXCA' + + + Namespace: + hbase> grant 'service', 'RWXCA', '@test-NS' + + + Table: + hbase> grant 'service', 'RWXCA', 'user' + + + Column Family: + hbase> grant '@developers', 'RW', 'user', 'i' + + + Column Qualifier: + hbase> grant 'service', 'RW', 'user', 'i', 'foo' + + + Cell: + Cell ACLs are granted using the following syntax: + grant <table>, \ + { '<user-or-group>' => \ + '<permissions>', ...
}, \ + { <scanner-specification> } + + + <user-or-group> is the user or group + name, prefixed with @ in the case of a group. + + + <permissions> is a string containing + any or all of "RWXCA", though only R and W are meaningful at cell + scope. + + + <scanner-specification> follows the scanner + specification syntax and conventions used by the 'scan' shell command. See + hbase-shell/src/main/ruby/shell/commands/scan.rb for + some examples of scanner specifications. + + + This example grants read access to the 'testuser' user and read/write access + to the 'developers' group, on cells in the 'pii' column which match the + filter. + hbase> grant 'user', \ + { '@developers' => 'RW', 'testuser' => 'R' }, \ + { COLUMNS => 'pii', FILTER => "(PrefixFilter ('test'))" } + The shell will run a scanner with the given criteria, rewrite the found + cells with new ACLs, and store them back to their exact coordinates. + + + In addition, the alter command has been extended to allow for a + change in table ownership: + hbase> alter 'tablename', {OWNER => 'username|@group'} + + + API + See the source files + hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java + and + hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/SecureTestUtil.java + for examples of setting and checking ACLs using the Java API. + Neither this example nor the source file it is taken from is part of the + public HBase API; both are provided for illustration only. Refer to the official API + for usage instructions. + The following example shows how to grant access at the + table level.
+ () { + @Override + public Void call() throws Exception { + HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME); + try { + BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW); + AccessControlService.BlockingInterface protocol = + AccessControlService.newBlockingStub(service); + ProtobufUtil.grant(protocol, user, table, family, qualifier, actions); + } finally { + acl.close(); + } + return null; + } + }); +} ]]> + + You can also use the Mutation.setACL method: + perms) + ]]> + + This example provides read permission to a user called + user1: + + + + + Revoking Access Control From a Namespace, Table, Column Family, or Cell + The revoke command and API are twins of the grant command and + API, and the syntax is exactly the same. You can only revoke access that has + previously been granted. A revoke statement is not the same thing + as explicit denial to a resource. + + HBase Shell support for granting and revoking access is for testing and verification + support, and should not be employed for production use because it won't apply the + permissions to cells that don't exist yet. The correct way to apply cell level + permissions is to do so in the application code when storing the values. + + + Revoking Access To a Table + Neither this example, nor the source file it is taken from, is part of the + public HBase API and is provided for illustration only. Refer to the official API + for usage instructions. 
+ +() { + @Override + public Void call() throws Exception { + HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME); + try { + BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW); + AccessControlService.BlockingInterface protocol = + AccessControlService.newBlockingStub(service); + ProtobufUtil.revoke(protocol, user, table, family, qualifier, actions); + } finally { + acl.close(); + } + return null; + } + }); +} ]]> + + + + + Show a User's Effective Permissions + + HBase Shell + hbase> user_permission 'user' + hbase> user_permission '.*' + hbase> user_permission JAVA_REGEX + + + API + Neither this example, nor the source file it is taken from, is part of the + public HBase API and is provided for illustration only. Refer to the official API + for usage instructions. + ) { + List results = (List) obj; + if (results != null && results.isEmpty()) { + fail("Empty non null results from action for user '" + user.getShortName() + "'"); + } + assertEquals(count, results.size()); + } + } catch (AccessDeniedException ade) { + fail("Expected action to pass for user '" + user.getShortName() + "' but was denied"); + } +} ]]> + + + + Cell-First Strategy + By default, ACLs are evaluated from least granular to most granular, and when an + ACL is reached that grants permission, evaluation stops. If you use cell ACLs and you + want the cell ACL to be evaluated first, you can use the method + Mutation.setACLStrategy(boolean cellFirstStrategy). options. + + +
+
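The permission strings used in the grant statements above, such as 'RW' or 'RWXCA', can be modeled with a small self-contained Java sketch. The class `PermissionStringSketch`, its `Action` enum, and its helper methods are hypothetical names chosen for illustration and are not part of the HBase API. The sketch also encodes two rules stated in this guide: Write does not imply Read, and CheckAndPut needs both Read and Write.

```java
import java.util.EnumSet;

public class PermissionStringSketch {
    /** One entry per permission letter used in grant statements (hypothetical model). */
    public enum Action {
        READ('R'), WRITE('W'), EXECUTE('X'), CREATE('C'), ADMIN('A');

        final char code;

        Action(char code) {
            this.code = code;
        }
    }

    /** Parses a string such as "RW" into a set of actions, rejecting unknown letters. */
    public static EnumSet<Action> parse(String permissions) {
        EnumSet<Action> result = EnumSet.noneOf(Action.class);
        for (char c : permissions.toCharArray()) {
            Action match = null;
            for (Action action : Action.values()) {
                if (action.code == c) {
                    match = action;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("Unknown permission code: " + c);
            }
            result.add(match);
        }
        return result;
    }

    /** A plain Put needs only Write. */
    public static boolean canPut(EnumSet<Action> granted) {
        return granted.contains(Action.WRITE);
    }

    /** CheckAndPut reads before it writes, so it needs both Read and Write. */
    public static boolean canCheckAndPut(EnumSet<Action> granted) {
        return granted.contains(Action.READ) && granted.contains(Action.WRITE);
    }

    public static void main(String[] args) {
        EnumSet<Action> writeOnly = parse("W");
        System.out.println("W  -> put: " + canPut(writeOnly)
            + ", checkAndPut: " + canCheckAndPut(writeOnly));

        EnumSet<Action> readWrite = parse("RW");
        System.out.println("RW -> put: " + canPut(readWrite)
            + ", checkAndPut: " + canCheckAndPut(readWrite));
    }
}
```

A write-only grant of this kind suits the log-writing process mentioned in the ACL matrix caveats: the process can store data that it is not able to read back.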
+
- Overview - With Secure RPC and Access Control enabled, client access to HBase is authenticated and - user data is private unless access has been explicitly granted. Access to data can be - granted at a table or per column family basis. - However, the following items have been left out of the initial implementation for - simplicity: - - - Row-level or per value (cell): Using Tags in HFile V3 - - - Push down of file ownership to HDFS: HBase is not designed for the case where files - may have different permissions than the HBase system principal. Pushing file ownership - down into HDFS would necessitate changes to core code. Also, while HDFS file ownership - would make applying quotas easy, and possibly make bulk imports more straightforward, it - is not clear that it would offer a more secure setup. - - - HBase managed "roles" as collections of permissions: We will not model "roles" - internally in HBase to begin with. We instead allow group names to be granted - permissions, which allows external modeling of roles via group membership. Groups are - created and manipulated externally to HBase, via the Hadoop group mapping - service. - - - Access control mechanisms are mature and fairly standardized in the relational database - world. The HBase implementation approximates current convention, but HBase has a simpler - feature set than relational databases, especially in terms of client operations. We don't - distinguish between an insert (new record) and update (of existing record), for example, as - both collapse down into a Put. Accordingly, the important operations condense to four - permissions: READ, WRITE, CREATE, and ADMIN. + Visibility Labels + Visibility labels can be used to permit only users or principals associated with + a given label to read or access cells with that label. For instance, you might label a cell + top-secret, and only grant access to that label to the + managers group.
Visibility labels are implemented using Tags, which are + a feature of HFile v3, and allow you to store metadata on a per-cell basis. A label is a + string, and labels can be combined into expressions by using logical operators (&, |, or + !), and using parentheses for grouping. The | operator is an + inclusive OR, not an exclusive OR. HBase does not do any kind of validation of expressions beyond basic + well-formedness. Visibility labels have no meaning on their own. They may be used to denote + sensitivity level, privilege level, or any other arbitrary semantic meaning. + If a user's labels do not match a cell's label or expression, the user is + denied access to the cell. + In HBase 0.98.6 and newer, UTF-8 encoding is supported for visibility labels and + expressions. When creating labels using the addLabels() method and passing + labels in Authorizations via Scan or Get, labels can contain UTF-8 characters, as well as + the characters !, &, and | written in normal Java notation, without needing any + escaping method. However, when you pass a CellVisibility expression via a Mutation, you + must enclose the expression with the CellVisibility.quote() method if you use + UTF-8 characters or the characters !, &, and |, which are otherwise treated as expression + operators. See TestExpressionParser and the source file + hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestScan.java. + + A user adds visibility expressions to a cell during a Put operation. By default, the user does not + need access to a label in order to label cells with it. This behavior is + controlled by the configuration option + hbase.security.visibility.mutations.checkauths. If you set this option to + true, the labels the user is modifying as part of the mutation must be + associated with the user, or the mutation will fail. Whether a user is authorized to read a + labelled cell is determined during a Get or Scan, and results which the user is not allowed + to read are filtered out.
This incurs the same I/O penalty as if the results were returned, + but reduces load on the network. + Visibility labels can also be specified during Delete operations. For details about + visibility labels and Deletes, see HBASE-10885. + The user's effective label set is built in the RPC context when a request is first + received by the RegionServer. The way that users are associated with labels is pluggable. + The default plugin passes through labels specified in Authorizations added to the Get or + Scan and checks those against the calling user's authenticated labels list. When the client + passes labels for which the user is not authenticated, the default plugin drops them. You + can pass a subset of user authenticated labels via the Get#setAuthorizations(new + Authorizations(String,...)) and Scan#setAuthorizations(new + Authorizations(String,...)) APIs. + Visibility label access checking is performed by the VisibilityController coprocessor. + You can use the VisibilityLabelService interface to provide a custom implementation + and/or control the way that visibility labels are stored with cells. See the source file + hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithCustomVisLabService.java + for one example. + + Visibility labels can be used in conjunction with ACLs. - Operation To Permission Mapping - - - + Examples of Visibility Expressions + - Permission - Operation + Expression + Interpretation - - - Read - Get - - - - Exists - - - - Scan - - - - Write - Put - - - - Delete - - - - Lock/UnlockRow - - - - IncrementColumnValue - - - - CheckAndDelete/Put - - - - Create - Create - - - - Alter - - - - Drop - - - - Bulk Load - - - - Admin - Enable/Disable - - - - Snapshot/Restore/Clone - - - - Split - - - - Flush - - - Compact + fulltime + Only allow access to users associated with the fulltime + label. - - Major Compact + !public + Allow access to users not associated with the public + label.
- - Grant - - - - Revoke - - - - Shutdown + ( secret | topsecret ) & !probationary + The user must be associated with either the secret and/or + topsecret label, and not be associated with the + probationary
- Permissions can be granted in any of the following scopes, though CREATE and ADMIN - permissions are effective only at table scope. +
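The way a visibility expression is matched against a user's labels can be sketched with a small recursive-descent evaluator. This is an illustrative model only, not the VisibilityController implementation: the class `VisibilityExpressionSketch` and its methods are hypothetical names, label quoting via CellVisibility.quote() is not handled, and the sketch assumes that ! binds tighter than &, which in turn binds tighter than |.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class VisibilityExpressionSketch {
    private final String text;
    private final Set<String> userLabels;
    private int pos = 0;

    private VisibilityExpressionSketch(String text, Set<String> userLabels) {
        this.text = text;
        this.userLabels = userLabels;
    }

    /** Returns true if a user holding userLabels satisfies the expression. */
    public static boolean isVisible(String expression, Set<String> userLabels) {
        VisibilityExpressionSketch parser =
            new VisibilityExpressionSketch(expression, userLabels);
        boolean result = parser.parseOr();
        if (parser.peek() != '\0') {
            throw new IllegalArgumentException("Unexpected character at position " + parser.pos);
        }
        return result;
    }

    /** Skips whitespace and returns the next character, or '\0' at end of input. */
    private char peek() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        return pos < text.length() ? text.charAt(pos) : '\0';
    }

    // or := and ('|' and)*   (inclusive OR)
    private boolean parseOr() {
        boolean value = parseAnd();
        while (peek() == '|') {
            pos++;
            boolean right = parseAnd(); // always parse the right side before combining
            value = value || right;
        }
        return value;
    }

    // and := not ('&' not)*
    private boolean parseAnd() {
        boolean value = parseNot();
        while (peek() == '&') {
            pos++;
            boolean right = parseNot();
            value = value && right;
        }
        return value;
    }

    // not := '!' not | '(' or ')' | LABEL
    private boolean parseNot() {
        char c = peek();
        if (c == '!') {
            pos++;
            return !parseNot();
        }
        if (c == '(') {
            pos++;
            boolean value = parseOr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (pos < text.length() && "!&|()".indexOf(text.charAt(pos)) < 0
                && !Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a label at position " + pos);
        }
        return userLabels.contains(text.substring(start, pos));
    }

    public static void main(String[] args) {
        Set<String> labels = new HashSet<>(Arrays.asList("secret"));
        System.out.println(isVisible("( secret | topsecret ) & !probationary", labels));
    }
}
```

Each operand is parsed before it is combined, so a syntax error on the right-hand side of an operator is reported even when the left-hand side already decides the result.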
+ Server-Side Configuration + + + As a prerequisite, perform the steps in Basic Server-Side Configuration. + + Install and configure the VisibilityController coprocessor, by setting the + following properties in hbase-site.xml. These properties take a + list of classes. + If you use the AccessController and VisibilityController coprocessors together, + the AccessController must come first in the list, because with both components + active, the VisibilityController will delegate access control on its system tables + to the AccessController. + + hbase.coprocessor.region.classes + org.apache.hadoop.hbase.security.visibility.VisibilityController + + + hbase.coprocessor.master.classes + org.apache.hadoop.hbase.security.visibility.VisibilityController + + ]]> + + + Adjust Configuration + By default, users can label cells with any label, including labels they are not + associated with, which means that a user can Put data that they cannot read. For + example, a user could label a cell with the (hypothetical) 'topsecret' label even if + the user is not associated with that label. If you only want users to be able to label + cells with labels they are associated with, set + hbase.security.visibility.mutations.checkauths to + true. In that case, the mutation will fail if it modifies labels + the user is not associated with. + + + Distribute your configuration and restart your cluster for changes to take + effect. + + +
+
+ Administration + Administration tasks can be performed using the HBase Shell or the Admin API. For + defining the list of visibility labels and associating labels with users, the + HBase Shell is probably simpler. + + + Define the List of Visibility Labels + + HBase Shell + hbase> add_labels [ 'admin', 'service', 'developer', 'test' ] + + + Java API + This example was taken from the source file + hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java. + Refer to that file or the API documentation for more context. + Neither this example, nor the source file it is taken from, is part of the + public HBase API and is provided for illustration only. Refer to the official API + for usage instructions. + action = + new PrivilegedExceptionAction() { + public VisibilityLabelsResponse run() throws Exception { + String[] labels = { SECRET, TOPSECRET, CONFIDENTIAL, PUBLIC, PRIVATE, COPYRIGHT, ACCENT, + UNICODE_VIS_TAG, UC1, UC2 }; + try { + VisibilityClient.addLabels(conf, labels); + } catch (Throwable t) { + throw new IOException(t); + } + return null; + } + }; + SUPERUSER.runAs(action); +} + ]]> + + + + Associate Labels with Users + + HBase Shell + hbase> set_auths 'service', [ 'service' ] + hbase> set_auths 'testuser', [ 'test' ] + hbase> set_auths 'qa', [ 'test', 'developer' ] + + + Java API + This example was taken from the source file + hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java. + Refer to that file or the API documentation for more context. + Neither this example, nor the source file it is taken from, is part of the + public HBase API and is provided for illustration only. Refer to the official API + for usage instructions. + action = new PrivilegedExceptionAction() { + public Void run() throws Exception { + String[] auths = { SECRET, CONFIDENTIAL }; + try { + VisibilityClient.setAuths(conf, auths, user); + } catch (Throwable e) { + } + return null; + } + ...
+ ]]> + + + + Clear Labels From Users + + HBase Shell + hbase> clear_auths 'service', [ 'service' ] + hbase> clear_auths 'testuser', [ 'test' ] + hbase> clear_auths 'qa', [ 'test', 'developer' ] + + + Java API + This example was taken from the source file + hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java. + Refer to that file or the API documentation for more context. + Neither this example, nor the source file it is taken from, is part of the + public HBase API and is provided for illustration only. Refer to the official API + for usage instructions. + + + + + Apply a Label or Expression to a Cell + The label is only applied when data is written. The label is associated with a + given version of the cell. + + HBase Shell + hbase> set_visibility 'user', 'admin|service|developer', \ + { COLUMNS => 'i' } + hbase> set_visibility 'user', 'admin|service', \ + { COLUMNS => 'pii' } + hbase> COLUMNS => [ 'i', 'pii' ], \ + FILTER => "(PrefixFilter ('test'))" } + + + HBase Shell support for applying labels or permissions to cells is intended for testing + and verification, and should not be employed for production use because it + won't apply the labels to cells that don't exist yet. The correct way to apply cell + level labels is to do so in the application code when storing the values. + + + Java API + This example was taken from the source file + hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java. + Refer to that file or the API documentation for more context. + Neither this example, nor the source file it is taken from, is part of the + public HBase API and is provided for illustration only. Refer to the official API + for usage instructions.
+ puts = new ArrayList(); + for (String labelExp : labelExps) { + Put put = new Put(Bytes.toBytes("row" + i)); + put.add(fam, qual, HConstants.LATEST_TIMESTAMP, value); + put.setCellVisibility(new CellVisibility(labelExp)); + puts.add(put); + i++; + } + table.put(puts); + } finally { + if (table != null) { + table.flushCommits(); + } + } + ]]> + + + +
+
+ Implementing Your Own Visibility Label Algorithm + Interpreting the labels authenticated for a given get/scan request is a pluggable + algorithm. You can specify a custom plugin by using the property + hbase.regionserver.scan.visibility.label.generator.class. The default + implementation class is + org.apache.hadoop.hbase.security.visibility.DefaultScanLabelGenerator. You + can also configure a set of ScanLabelGenerators to be used by the system, as + a comma-separated list. +
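To make the idea of a label-generation algorithm concrete, the following stand-alone sketch mimics the behavior this chapter attributes to the default plugin: labels requested in the Authorizations of a Get or Scan are kept only if the calling user is authenticated for them, and all others are dropped. It is plain Java and does not implement or call the HBase ScanLabelGenerator interface. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Stand-alone illustration of "drop labels the user is not authenticated for".
// It does not use the HBase API; all names here are hypothetical.
final class LabelFilterSketch {

    /** Keeps only the requested labels that appear in the user's authenticated set. */
    static List<String> effectiveLabels(List<String> requested, Set<String> authenticated) {
        List<String> effective = new ArrayList<>();
        for (String label : requested) {
            if (authenticated.contains(label)) {
                effective.add(label); // requested and authenticated: pass through
            }
            // otherwise the label is silently dropped
        }
        return effective;
    }

    public static void main(String[] args) {
        List<String> requested = List.of("secret", "topsecret");
        Set<String> authenticated = Set.of("secret", "confidential");
        System.out.println(effectiveLabels(requested, authenticated));
    }
}
```

A custom algorithm would replace this filtering rule with its own policy, for example by ignoring the requested labels and always using the full authenticated set.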
+
- - - Table - - - - Read: User can read from any column family in table - - - Write: User can write to any column family in table - - - Create: User can alter table attributes; add, alter, or drop column families; - and drop the table. - - - Admin: User can alter table attributes; add, alter, or drop column families; - and enable, disable, or drop the table. User can also trigger region - (re)assignments or relocation. - - - - - - Column Family - - - - Read: User can read from the column family - - - Write: User can write to the column family - - - - - +
+ Transparent Encryption of Data At Rest + HBase provides a mechanism for protecting your data at rest, in HFiles and the WAL, which + reside within HDFS or another distributed filesystem. A two-tier architecture is used for + flexible and non-intrusive key rotation. "Transparent" means that no implementation changes + are needed at the client side. When data is written, it is encrypted. When it is read, it is + decrypted on demand. +
+ How It Works + The administrator provisions a master key for the cluster, which is stored in a key + provider which is accessible to every trusted HBase process, including the HMaster, + RegionServers, and clients (such as HBase Shell) which reside on administrative + workstations. The default key provider is integrated with the Java KeyStore API and any + key management systems with support for it. Other custom key provider implementations are + possible. The key retrieval mechanism is configured in the + hbase-site.xml configuration file. The master key may be stored on + the cluster servers, protected by a secure KeyStore file, or on an external keyserver, or + in a hardware security module. This master key is resolved as needed by HBase processes + through the configured key provider. + Next, encryption use can be specified in the schema, per Column Family, by creating + or modifying a column descriptor to include two additional attributes: the name of the + encryption algorithm to use (currently only "AES" is supported), and, optionally, a data + key wrapped (encrypted) with the cluster master key. If a data key is not explicitly + configured for a ColumnFamily, HBase will create a random data key per HFile. This + provides an incremental improvement in security over using one explicitly configured data + key for every HFile in the Column Family. Unless you need to + supply an explicit data key, such as in a case where you are generating encrypted HFiles + for bulk import with a given data key, only specify the encryption algorithm in the + ColumnFamily schema metadata and let HBase create data keys on demand. Per Column Family + keys facilitate low impact incremental key rotation and reduce the scope of any external + leak of key material. The wrapped data key is stored in the ColumnFamily schema metadata, + and in each HFile for the Column Family, encrypted with the cluster master key. After the + Column Family is configured for encryption, any new HFiles will be written encrypted.
To + ensure encryption of all HFiles, trigger a major compaction after enabling this + feature. + When the HFile is opened, the data key is extracted from the HFile, decrypted with the + cluster master key, and used for decryption of the remainder of the HFile. The HFile will + be unreadable if the master key is not available. If a remote user somehow acquires access + to the HFile data because of some lapse in HDFS permissions, or from inappropriately + discarded media, it will not be possible to decrypt either the data key or the file + data. + It is also possible to encrypt the WAL. Even though WALs are transient, it is + necessary to encrypt the WALEdits to avoid circumventing HFile protections for encrypted + column families, in the event that the underlying filesystem is compromised. When WAL + encryption is enabled, all WALs are encrypted, regardless of whether the relevant HFiles + are encrypted. +
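The two-tier arrangement described above, in which a data key is wrapped with the cluster master key and unwrapped when a file is opened, can be sketched with the standard Java cryptography API. This is only an illustration of the concept. It uses the JDK's "AESWrap" transformation, which is an assumption of the sketch and is not the format HBase uses to store wrapped keys. The class name TwoTierKeySketch is hypothetical.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Concept sketch of a master key wrapping a data key, using only the JDK.
// NOT HBase's implementation or on-disk format.
final class TwoTierKeySketch {

    /** Generates a random 128-bit AES key. */
    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    /** Wraps (encrypts) the data key with the master key. */
    static byte[] wrap(SecretKey masterKey, SecretKey dataKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.WRAP_MODE, masterKey);
        return cipher.wrap(dataKey);
    }

    /** Recovers the data key; fails if the master key is not the one used to wrap. */
    static SecretKey unwrap(SecretKey masterKey, byte[] wrapped) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.UNWRAP_MODE, masterKey);
        return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey masterKey = newAesKey(); // stands in for the cluster master key
        SecretKey dataKey = newAesKey();   // stands in for a per-HFile data key
        byte[] wrapped = wrap(masterKey, dataKey);
        SecretKey recovered = unwrap(masterKey, wrapped);
        System.out.println(java.util.Arrays.equals(dataKey.getEncoded(), recovered.getEncoded()));
    }
}
```

Only the wrapped bytes would be stored alongside the data, so a reader without the master key can recover neither the data key nor the data.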
+
+ Server-Side Configuration + This procedure assumes you are using the default Java keystore implementation. If you + are using a custom implementation, check its documentation and adjust accordingly. + + + Create a secret key of appropriate length for AES encryption, using the + <command>keytool</command> utility. + $ keytool -keystore /path/to/hbase/conf/hbase.jks \ + -storetype jceks -storepass <password> \ + -genseckey -keyalg AES -keysize 128 \ + -alias <alias> + Replace <password> with the password for the keystore file and <alias> + with the username of the HBase service account, or an arbitrary string. If you use an + arbitrary string, you will need to configure HBase to use it, and that is covered + below. Specify a keysize that is appropriate. Do not specify a separate password for + the key, but press Return when prompted. + + + Set appropriate permissions on the keyfile and distribute it to all the HBase + servers. + The previous command created a file called hbase.jks in the + HBase conf/ directory. Set the permissions and ownership on this + file such that only the HBase service account user can read the file, and use a + secure mechanism, such as SSH, to distribute the key to all + HBase servers. + + + Configure the HBase daemons. + Set the following properties to configure HBase daemons to use a key provider + backed by the KeyStore file for retrieving the cluster master key. + + hbase.crypto.keyprovider + org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider + + + hbase.crypto.keyprovider.parameters + jceks:///path/to/hbase/conf/hbase.jks?password= + + ]]> + By default, the HBase service account name will be used to resolve the cluster + master key. However, you can store it with an arbitrary alias (in the + keytool command). In that case, set the following property to the + alias you used. + + hbase.crypto.master.key.name + my-alias + + ]]> + You also need to be sure your HFiles use HFile v3, in order to use transparent + encryption.
Set the following property in your hbase-site.xml + file. + + hfile.format.version + 3 + + ]]> + Optionally, you can use a different cipher provider, either a Java Cryptography + Extension (JCE) algorithm provider or a custom HBase cipher implementation. + + + JCE: + + + Install a signed JCE provider (supporting “AES/CTR/NoPadding” mode with + 128-bit keys) + + + Add it with highest preference to the JCE site configuration file + $JAVA_HOME/lib/security/java.security. + + + Update and + options in + hbase-site.xml. + + + + + Custom HBase Cipher: + + + Implement + org.apache.hadoop.hbase.io.crypto.CipherProvider. + + + Add the implementation to the server classpath. + + + Update in + hbase-site.xml. + + + + + + + Configure WAL encryption. + Configure WAL encryption in every RegionServer's + hbase-site.xml, by setting the following properties. You can + include these in the HMaster's hbase-site.xml as well, but the + HMaster does not have a WAL and will not use them. + + hbase.regionserver.hlog.reader.impl + org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader + + + hbase.regionserver.hlog.writer.impl + org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter + + + hbase.regionserver.wal.encryption + true + + ]]> + + + Configure permissions on the <filename>hbase-site.xml</filename> file. + Because the keystore password is stored in hbase-site.xml, you need to ensure + that only the HBase user can read the hbase-site.xml file, using + file ownership and permissions. Also, whenever you distribute the configuration to + your cluster nodes, be sure to use a secure means of transport, such as + ssh. + + + Distribute the new configuration and restart your cluster. + Distribute the new configuration file using a secure mechanism, and restart your + cluster. + + +
+
+ Administration + + + Enable Encryption on a Column Family + + To enable encryption on a column family, you can either use HBase Shell or the + Java API. After enabling encryption, trigger a major compaction. When the major + compaction completes, the HFiles will be encrypted. + + HBase Shell + +hbase> disable 'mytable' +hbase> alter 'mytable', 'mycf', {ENCRYPTION => AES} +hbase> enable 'mytable' + + + + Java API + You can use the HBaseAdmin#modifyColumn API to modify the + ENCRYPTION attribute on a Column Family. Additionally, you + can specify the specific key to use as the wrapper, by setting the + ENCRYPTION_KEY attribute. This is only possible via the + API, and not the HBase Shell. + The following example is taken from the source file + hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckEncryption.java, + and shows how to programmatically set the transparent encryption both in the + server configuration and at the column family, as part of a test which uses the + Minicluster configuration. + Neither this example, nor the source file it is taken from, is part of the + public HBase API and is provided for illustration only. Refer to the official API + for usage instructions. + +@Before +public void setUp() throws Exception { + conf = TEST_UTIL.getConfiguration(); + conf.setInt("hfile.format.version", 3); + conf.set(HConstants.CRYPTO_KEYPROVIDER_CONF_KEY, KeyProviderForTesting.class.getName()); + conf.set(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, "hbase"); + + // Create the test encryption key + SecureRandom rng = new SecureRandom(); + byte[] keyBytes = new byte[AES.KEY_LENGTH]; + rng.nextBytes(keyBytes); + cfKey = new SecretKeySpec(keyBytes, "AES"); + + // Start the minicluster + TEST_UTIL.startMiniCluster(3); - There is also an implicit global scope for the superuser. - The superuser is a principal, specified in the HBase site configuration file, that has - equivalent access to HBase as the 'root' user would on a UNIX derived system.
Normally this - is the principal that the HBase processes themselves authenticate as. Although future - versions of HBase Access Control may support multiple superusers, the superuser privilege - will always include the principal used to run the HMaster process. Only the superuser is - allowed to create tables, switch the balancer on or off, or take other actions with global - consequence. Furthermore, the superuser has an implicit grant of all permissions to all - resources. - Tables have a new metadata attribute: OWNER, the user principal who owns the table. By - default this will be set to the user principal who creates the table, though it may be - changed at table creation time or during an alter operation by setting or changing the OWNER - table attribute. Only a single user principal can own a table at a given time. A table owner - will have all permissions over a given table. + // Create the table + htd = new HTableDescriptor(TableName.valueOf("default", "TestHBaseFsckEncryption")); + HColumnDescriptor hcd = new HColumnDescriptor("cf"); + hcd.setEncryptionType("AES"); + hcd.setEncryptionKey(EncryptionUtil.wrapKey(conf, + conf.get(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, User.getCurrent().getShortName()), + cfKey)); + htd.addFamily(hcd); + TEST_UTIL.getHBaseAdmin().createTable(htd); + TEST_UTIL.waitTableAvailable(htd.getName(), 5000); +} + + + + + + Rotate the Data Key + + To rotate the data key, first change the ColumnFamily key in the column + descriptor, then trigger a major compaction. When compaction is complete, all HFiles + will be re-encrypted using the new data key. Until the compaction completes, the + old HFiles will still be readable using the old key. + + + + Rotate the Master Key + + To rotate the master key, first generate and distribute the new key. Then update + the KeyStore to contain a new master key, and keep the old master key in the + KeyStore using a different alias. 
Next, configure fallback to the old master key in + the hbase-site.xml file. + + hbase.crypto.master.alternate.key.name + hbase.old + + ]]> + Rolling restart your cluster for this change to take effect. Trigger a major + compaction on each table. At the end of the major compaction, all HFiles will be + re-encrypted with data keys wrapped by the new cluster key. At this point, you can + remove the old master key from the KeyStore, remove the configuration for the + fallback master key from the hbase-site.xml, and perform a + second rolling restart at some point. This second rolling restart is not + time-sensitive. + + + + + + + + + +
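The role of the fallback master key during rotation can be illustrated with a small stand-alone sketch: a wrapped data key is first unwrapped with the current master key and, if that fails, with the alternate (old) master key. This is a conceptual sketch built on the JDK's "AESWrap" transformation, which is an assumption of the example. It is not HBase's implementation, and the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Concept sketch of "try the current master key, then the alternate one".
// NOT HBase's implementation; all names are hypothetical.
final class MasterKeyFallbackSketch {

    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    static byte[] wrap(SecretKey masterKey, SecretKey dataKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.WRAP_MODE, masterKey);
        return cipher.wrap(dataKey);
    }

    static SecretKey unwrap(SecretKey masterKey, byte[] wrapped) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.UNWRAP_MODE, masterKey);
        return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    /** Unwraps with the current master key, falling back to the alternate key on failure. */
    static SecretKey unwrapWithFallback(byte[] wrapped, SecretKey current, SecretKey alternate)
            throws GeneralSecurityException {
        try {
            return unwrap(current, wrapped);
        } catch (GeneralSecurityException e) {
            if (alternate == null) {
                throw e; // no fallback configured
            }
            return unwrap(alternate, wrapped);
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey oldMaster = newAesKey();
        SecretKey newMaster = newAesKey();
        SecretKey dataKey = newAesKey();
        // A file written before the rotation still carries a key wrapped by the old master key.
        byte[] wrappedByOld = wrap(oldMaster, dataKey);
        SecretKey recovered = unwrapWithFallback(wrappedByOld, newMaster, oldMaster);
        System.out.println(java.util.Arrays.equals(dataKey.getEncoded(), recovered.getEncoded()));
    }
}
```

Once every file has been rewritten with data keys wrapped by the new master key, the fallback branch is no longer exercised, which is why the old key and its configuration can then be removed.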
-
- Access Control Matrix - The following matrix shows the minimum permission set required to perform operations in - HBase. Before using the table, read through the information about how to interpret it. - - Interpreting the ACL Matrix Table - The following conventions are used in the ACL Matrix table: - - Scopes - - Permissions are evaluated starting at the widest scope and working to the - narrowest scope. A scope corresponds to a level of the data model. From broadest to - narrowest, the scopes are as follows:: - - Global - Namespace (NS) - Table - Column Qualifier (CF) - Column Family (CQ) - Cell - - For instance, a permission granted at table level dominates any grants done at the - ColumnFamily, ColumnQualifier, or cell level. The user can do what that grant implies - at any location in the table. A permission granted at global scope dominates all: the - user is always allowed to take that action everywhere. - - - - Permissions - - Possible permissions include the following: - - Superuser - a special user that belongs to group "supergroup" and has - unlimited access - Admin (A) - Create (C) - Write (W) - Read (R) - Execute (X) - - - - - For the most part, permissions work in an expected way, with the following caveats: +
+ Secure Bulk Load + Bulk loading in secure mode is a bit more involved than normal setup, since the client + has to transfer the ownership of the files generated from the mapreduce job to HBase. Secure + bulk loading is implemented by a coprocessor, named SecureBulkLoadEndpoint, which uses a staging directory configured by the + configuration property hbase.bulkload.staging.dir, which defaults to + /tmp/hbase-staging/. + Secure Bulk Load Algorithm - Having Write permission does not imply Read permission. It is possible and sometimes - desirable for a user to be able to write data that same user cannot read. One such example - is a log-writing process. - - - Admin is a superset of Create, so a user with Admin permissions does not also need - Create permissions to perform an action such as creating a table. - - - The hbase:meta table is readable by every user, regardless - of the user's other grants or restrictions. This is a requirement for HBase to - function correctly. + One time only, create a staging directory which is world traversable and owned by + the user that runs HBase (mode 711, or rwx--x--x). A listing of this + directory will look similar to the following: + $ ls -ld /tmp/hbase-staging +drwx--x--x 2 hbase hbase 68 3 Sep 14:54 /tmp/hbase-staging + - Users with Create or Admin permissions are granted Write permission on meta regions, - so the table operations they are allowed to perform can complete, even if technically - the bits can be granted separately in any possible combination. + A user writes out data to a secure output directory owned by that user. For example, + /user/foo/data. - CheckAndPut and CheckAndDelete operations will fail if the user does not have both - Write and Read permission. + Internally, HBase creates a secret staging directory which is globally + readable/writable (-rwxrwxrwx, 777). For example, + /tmp/hbase-staging/averylongandrandomdirectoryname. The name and + location of this directory is not exposed to the user.
HBase manages creation and + deletion of this directory. - Increment and Append operations do not require Read access. + The user makes the data world readable and writable, moves it into the random + staging directory, then calls the bulkLoadHFiles() method. - - The following table is sorted by the interface that provides each operation. In case the - table goes out of date, the unit tests which check for accuracy of permissions can be found - in - hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java, - and the access controls themselves can be examined in - hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java. - - - ACL Matrix - - - - Interface - Operation - Minimum Scope - Minimum Permission - - - - - - - Master - - - createTable - - - Global - - - A - - - - - modifyTable - - - Table - - - A|CW - - - - - deleteTable - - - Table - - - A|CW - - - - - truncateTable - - - Table - - - A|CW - - - - - addColumn - - - Table - - - A|CW - - - - - modifyColumn - - - Table - - - A|CW - - - - - deleteColumn - - - Table - - - A|CW - - - - - disableTable - - - Table - - - A|CW - - - - - disableAclTable - - - None - - - Not allowed - - - - - enableTable - - - Table - - - A|CW - - - - - move - - - Global - - - A - - - - - assign - - - Global - - - A - - - - - unassign - - - Global - - - A - - - - - regionOffline - - - Global - - - A - - - - - balance - - - Global - - - A - - - - - balanceSwitch - - - Global - - - A - - - - - shutdown - - - Global - - - A - - - - - stopMaster - - - Global - - - A - - - - - snapshot - - - Global - - - A - - - - - clone - - - Global - - - A - - - - - restore - - - Global - - - A - - - - - deleteSnapshot - - - Global - - - A - - - - - createNamespace - - - Global - - - A - - - - - deleteNamespace - - - Namespace - - - A - - - - - modifyNamespace - - - Namespace - - - A - - - - - flushTable - - - Table - - - A|CW - - - - - getTableDescriptors - - - Global|Table - - - A - - - 
- - mergeRegions - - - Global - - - A - - - - Region - - preOpen - Global - A - - - - openRegion - - - Global - - - A - - - - preClose - Global - A - - - - closeRegion - - - Global - - - A - - - - preStopRegionServer - Global - A - - - - stopRegionServer - - - Global - - - A - - - - - mergeRegions - - - Global - - - A - - - - append - Table - W - - - delete - Table|CF|CQ - W - - - exists - Table|CF|CQ - R - - - get - Table|CF|CQ - R - - - getClosestRowBefore - Table|CF|CQ - R - - - increment - Table|CF|CQ - W - - - put - Table|CF|CQ - W - - - - flush - - - Global - - - A|CW - - - - - split - - - Global - - - A - - - - - compact - - - Global - - - A|CW - - - - bulkLoadHFile - Table - W - - - prepareBulkLoad - Table - CW - - - cleanupBulkLoad - Table - W - - - checkAndDelete - Table|CF|CQ - RW - - - checkAndPut - Table|CF|CQ - RW - - - incrementColumnValue - Table|CF|CQ - RW - - - ScannerClose - Table - R - - - ScannerNext - Table - R - - - ScannerOpen - Table|CQ|CF - R - - - - Endpoint - - - invoke - - Endpoint - - X - - - - - AccessController - - - grant - - Global|Table|NS - - A - - - - - revoke - - Global|Table|NS - - A - - - - - userPermissions - - - Global|Table|NS - - - A - - - - - checkPermissions - - - Global|Table|NS - - - A - - - - -
-
- -
- Server-side Configuration for Access Control - Enable the AccessController coprocessor in the cluster configuration and restart HBase. - The restart can be a rolling one. Complete the restart of all Master and RegionServer - processes before setting up ACLs. - To enable the AccessController, modify the hbase-site.xml file on every - server machine in the cluster to look like: + Like delegation tokens, the strength of the security lies in the length and randomness + of the secret directory. + + To enable secure bulk load, add the following properties to + hbase-site.xml. - hbase.coprocessor.master.classes - org.apache.hadoop.hbase.security.access.AccessController + hbase.bulkload.staging.dir + /tmp/hbase-staging -hbase.coprocessor.region.classes + hbase.coprocessor.region.classes org.apache.hadoop.hbase.security.token.TokenProvider, - org.apache.hadoop.hbase.security.access.AccessController + org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint ]]>
+ +
-
- Cell level Access Control using Tags - Prior to HBase 0.98 access control was restricted to table and column family level. - Thanks to tags feature in 0.98 that allows Access control on a cell level. The existing - Access Controller coprocessor helps in achieving cell level access control also. For details - on configuring it refer to Access Control section. - The ACLs can be specified for every mutation using the APIs - perms) - ]]> - For example, to provide read permission to an user ‘user1’ then - - Generally the ACL applied on the table and CF takes precedence over Cell level ACL. In - order to make the cell level ACL to take precedence use the following API, - - Please note that inorder to use this feature, HFile V3 version should be turned on. +
+ Security Configuration Example + This configuration example includes support for HFile v3, ACLs, Visibility Labels, and + transparent encryption of data at rest and the WAL. All options have been discussed separately + in the sections above. + + Example Security Settings in <filename>hbase-site.xml</filename> hfile.format.version 3 - ]]> - Note that deletes with ACLs do not have any effect. To keep things simple the ACLs - applied on the current Put does not change the ACL of any previous Put in the sense that the - ACL on the current put does not affect older versions of Put for the same row. -
-
- Shell Enhancements for Access Control - The HBase shell has been extended to provide simple commands for editing and updating - user permissions. The following commands have been added for access control list management: - - - - Grant - [ [ [ ] ] ] - ]]> - - - - <user|@group> is user or group (start with character '@'), Groups are - created and manipulated via the Hadoop group mapping service. - - <permissions> is zero or more letters from the set "RWCA": READ('R'), - WRITE('W'), CREATE('C'), ADMIN('A'). - Note: Grants and revocations of individual permissions on a resource are both - accomplished using the grant command. A separate revoke command is - also provided by the shell, but this is for fast revocation of all of a user's access rights - to a given resource only. - - Revoke - - [
[ [ ] ] ] - ]]> - - - Alter - - The alter command has been extended to allow ownership - assignment: - 'username|@group'} -]]> - - - - User Permission - - The user_permission command shows all access permissions for the current - user for a given table: - - ]]> - - - - - - -
- Secure Bulk Load - Bulk loading in secure mode is a bit more involved than normal setup, since the client - has to transfer the ownership of the files generated from the mapreduce job to HBase. Secure - bulk loading is implemented by a coprocessor, named SecureBulkLoadEndpoint. - SecureBulkLoadEndpoint uses a staging directory "hbase.bulkload.staging.dir", - which defaults to /tmp/hbase-staging/. The algorithm is as follows. - - - Create an hbase owned staging directory which is world traversable (-rwx--x--x, - 711) /tmp/hbase-staging. - - - A user writes out data to his secure output directory: /user/foo/data - - - A call is made to hbase to create a secret staging directory which is globally - readable/writable (-rwxrwxrwx, 777): - /tmp/hbase-staging/averylongandrandomdirectoryname - - - The user makes the data world readable and writable, then moves it into the random - staging directory, then calls bulkLoadHFiles() - - - Like delegation tokens the strength of the security lies in the length and randomness of - the secret directory. - - You have to enable the secure bulk load to work properly. You can modify the - hbase-site.xml file on every server machine in the cluster and add the - SecureBulkLoadEndpoint class to the list of regionserver coprocessors: - - hbase.bulkload.staging.dir - /tmp/hbase-staging + hbase.superuser + hbase, admin + hbase.coprocessor.region.classes - org.apache.hadoop.hbase.security.token.TokenProvider, - org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint + org.apache.hadoop.hbase.security.access.AccessController, + org.apache.hadoop.hbase.security.visibility.VisibilityController, + org.apache.hadoop.hbase.security.token.TokenProvider - ]]> -
- - -
- Visibility Labels - This feature provides cell level security with labeled visibility for the cells. Cells - can be associated with a visibility expression. The visibility expression can contain labels - joined with logical expressions '&', '|' and '!'. Also using - '(', ')' one can specify the precedence order. For example, consider the label - set { confidential, secret, topsecret, probationary }, where the first three are sensitivity - classifications and the last describes if an employee is probationary or not. If a cell is - stored with this visibility expression: ( secret | topsecret ) & !probationary - Then any user associated with the secret or topsecret label will be able to view the - cell, as long as the user is not also associated with the probationary label. Furthermore, any - user only associated with the confidential label, whether probationary or not, will not see - the cell or even know of its existence. - Visibility expressions like the above can be added when storing or mutating a cell using - the API, - Mutation#setCellVisibility(new CellVisibility(String labelExpession)); - Where the labelExpression could be '( secret | topsecret ) & !probationary' - We build the user's label set in the RPC context when a request is first received by - the HBase RegionServer. How users are associated with labels is pluggable. The default plugin - passes through labels specified in Authorizations added to the Get or Scan and checks those - against the calling user's authenticated labels list. When client passes some labels for - which the user is not authenticated, this default algorithm will drop those. One can pass a - subset of user authenticated labels via the Scan/Get authorizations. - Get#setAuthorizations(new Authorizations(String,...)); - Scan#setAuthorizations(new Authorizations(String,...)); - -
- Visibility Label Administration - There are new client side Java APIs and shell commands for performing visibility labels - administrative actions. Only the HBase super user is authorized to perform these operations. - -
- Adding Labels - A set of labels can be added to the system either by using the Java API - VisibilityClient#addLabels(Configuration conf, final String[] - labels) - Or by using the shell command - add_labels [label1, label2] - Valid label can include alphanumeric characters and characters '-', - '_', ':', '.' and '/' -
- -
- User Label Association - A set of labels can be associated with a user by using the API - VisibilityClient#setAuths(Configuration conf, final String[] auths, final String - user) - Or by using the shell command - set_auths user,[label1, label2]. - Labels can be disassociated from a user using API - VisibilityClient#clearAuths(Configuration conf, final String[] auths, final - String user) - Or by using shell command - clear_auths user,[label1, label2] - One can use the API VisibilityClient#getAuths(Configuration conf, final String - user) or get_auths shell command to get the list of labels - associated for a given user. The labels and user auths information will be stored in the - system table "labels". -
-
- -
- Server Side Configuration - HBase stores cell level labels as cell tags. HFile version 3 adds the cell tags - support. Be sure to use HFile version 3 by setting this property in every server site - configuration file: - - hfile.format.version - 3 - - ]]> - You will also need to make sure the VisibilityController coprocessor is active on every - table to protect by adding it to the list of system coprocessors in the server site - configuration files: - hbase.coprocessor.master.classes -org.apache.hadoop.hbase.security.visibility.VisibilityController + org.apache.hadoop.hbase.security.access.AccessController, + org.apache.hadoop.hbase.security.visibility.VisibilityController - hbase.coprocessor.region.classes -org.apache.hadoop.hbase.security.visibility.VisibilityController - - ]]> - As said above, finding out labels authenticated for a given get/scan request is a - pluggable algorithm. A custom implementation can be plugged in using the property - hbase.regionserver.scan.visibility.label.generator.class. The default - implementation class is - org.apache.hadoop.hbase.security.visibility.DefaultScanLabelGenerator. One - can configure a set of ScanLabelGenerators to be used by the system. For this, a comma - separated set of implementation class names to be configured. - - Visibility Labels and Replication - By default, visibility labels are lost on replication. To change this behavior, see - . - -
-
- -
- Transparent Server Side Encryption - This feature provides transparent encryption for protecting HFile and WAL data at rest, - using a two-tier key architecture for flexible and non-intrusive key rotation. - First, the administrator provisions a cluster master key, stored in a key provider - accessible to every trusted HBase process: the Master, the RegionServers, and clients (e.g. - the shell) on administrative workstations. The default key provider integrates with the Java - KeyStore API and any key management system with support for it. How HBase retrieves key - material is configurable via the site file. The master key may be stored on the cluster - servers, protected by a secure KeyStore file, or on an external keyserver, or in a hardware - security module. This master key is resolved as needed by HBase processes through the - configured key provider. - Then, encryption keys can be specified in schema on a per column family basis, by - creating or modifying a column descriptor to include two additional attributes: the name of - the encryption algorithm to use (currently only "AES" is supported), and, optionally, a data - key wrapped (encrypted) with the cluster master key. Per-CF keys facilitate low-impact - incremental key rotation and reduce the scope of any external leak of key material. The - wrapped data key is stored in the CF schema metadata, and in each HFile for the CF, encrypted - with the cluster master key. Once the CF is configured for encryption, any new HFiles will be - written encrypted. To ensure encryption of all HFiles, trigger a major compaction after first - enabling this feature. The key for decryption, encrypted with the cluster master key, is - stored in the HFiles in a new meta block. At file open time the data key will be extracted - from the HFile, decrypted with the cluster master key, and used for decryption of the - remainder of the HFile. The HFile will be unreadable if the master key is not available. 
- Should remote users somehow acquire access to the HFile data because of some lapse in HDFS - permissions or from inappropriately discarded media, there will be no means to decrypt either - the data key or the file data. - Specifying a data key in the CF schema is optional. If one is not present, a random data - key will be created for each HFile. - A new configuration option for encrypting the WAL is also introduced. Even though WALs - are transient, it is necessary to encrypt the WALEdits to avoid circumventing HFile - protections for encrypted column families. -
- Configuration - Create a secret key of appropriate length for AES. - \ - -genseckey -keyalg AES -keysize 128 \ - -alias - ]]> - where <password> is the password for the KeyStore file and <alias> is the - user name of the HBase service account, typically "hbase". Simply press RETURN to store the - key with the same password as the store. The resulting file should be distributed to all - nodes running HBase daemons, with file ownership and permissions set to be readable only by - the HBase service account. - Configure HBase daemons to use a key provider backed by the KeyStore files for - retrieving the cluster master key as needed. - hbase.coprocessor.regionserver.classes + org.apache.hadoop.hbase.security.access.AccessController, + org.apache.hadoop.hbase.security.visibility.VisibilityController + + + + hbase.security.exec.permission.checks + true + + + + hbase.security.visibility.mutations.checkauth + false + + + + hbase.rpc.protection + auth-conf + + hbase.crypto.keyprovider org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider @@ -1698,27 +1820,11 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \ hbase.crypto.keyprovider.parameters jceks:///path/to/hbase/conf/hbase.jks?password= - ]]> - By default the HBase service account name will be used to resolve the cluster master - key, but you can store it with any arbitrary alias and configure HBase appropriately: - hbase.crypto.master.key.name hbase - - ]]> - Because the password to the key store is sensitive information, the HBase site XML file - should also have its permissions set to be readable only by the HBase service account. - Transparent encryption is a feature of HFile version 3. 
Be sure to use HFile version 3 - by setting this property in every server site configuration file: - - hfile.format.version - 3 - - ]]> - Finally, configure the secure WAL in every server site configuration file: - + hbase.regionserver.hlog.reader.impl org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader @@ -1730,54 +1836,73 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \ hbase.regionserver.wal.encryption true + + + + hbase.crypto.master.alternate.key.name + hbase.old + + + + hbase.bulkload.staging.dir + /tmp/hbase-staging + + + hbase.coprocessor.region.classes + org.apache.hadoop.hbase.security.token.TokenProvider, + org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint ]]> -
-
- Setting Encryption on a CF - To enable encryption on a CF, use HBaseAdmin#modifyColumn or the HBase - shell to modify the column descriptor. The attribute 'ENCRYPTION' specifies the encryption - algorithm to use. Currently only "AES" is supported. If creating a new table, simply set - this attribute; no subsequent table modification will be necessary. - If setting a specific data key, the attribute 'ENCRYPTION_KEY' should contain the data - key wrapped by the cluster master key. The static methods wrapKey and - unwrapKey in org.apache.hadoop.hbase.security.EncryptionUtil can - be used in conjunction with HColumnDescriptor#setEncryptionKey for this - purpose. Because this must be done programmatically, setting a data key with the shell is not - supported. - To disable encryption on a CF, simply remove the 'ENCRYPTION' (and 'ENCRYPTION_KEY', if - it was set) attributes from the column schema, using HBaseAdmin#modifyColumn or - the HBase shell. All new HFiles for the CF will be written without encryption. Trigger a - major compaction to rewrite all files. -
-
- Data Key Rotation - Data key rotation is made simple by this design. First, change the CF key in the column - descriptor. Then, trigger major compaction. Once compaction has completed, all files will be - (re)encrypted with the new key material. While this process is ongoing, HFiles encrypted - with old key material will still be readable. -
-
- Master Key Rotation - Master key rotation can be achieved by updating the KeyStore to contain a new master - key, as described above, while keeping the old master key in the KeyStore under a - different alias. Then, configure fallback to the old master key in the HBase site file: + + + Example Group Mapper in Hadoop <filename>core-site.xml</filename> + hadoop.security.group.mapping + org.apache.hadoop.security.LdapGroupsMapping + + - hbase.crypto.master.alternate.key.name - hbase.old + hadoop.security.group.mapping.ldap.url + ldap://server + + + + hadoop.security.group.mapping.ldap.bind.user + Administrator@example-ad.local + + + + hadoop.security.group.mapping.ldap.bind.password + **** + + + + hadoop.security.group.mapping.ldap.base + dc=example-ad,dc=local + + + + hadoop.security.group.mapping.ldap.search.filter.user + (&(objectClass=user)(sAMAccountName={0})) + + + + hadoop.security.group.mapping.ldap.search.filter.group + (objectClass=group) + + + + hadoop.security.group.mapping.ldap.search.attr.member + member + + + + hadoop.security.group.mapping.ldap.search.attr.group.name + cn ]]> - This will require a rolling restart of the HBase daemons to take effect. As with data - key rotation, trigger a major compaction and wait for it to complete. Once compaction has - completed, all files will be (re)encrypted with data keys wrapped by the new cluster master - key. The old master key, and its associated site file configuration, can then be removed, - and all trace of the old master key will be gone after the next rolling restart. A second - rolling restart is not immediately necessary. -
+
diff --git src/main/site/resources/images/LDAPScanLabelGenerator.png src/main/site/resources/images/LDAPScanLabelGenerator.png new file mode 100644 index 0000000..4fb67a5 Binary files /dev/null and src/main/site/resources/images/LDAPScanLabelGenerator.png differ