diff --git src/main/docbkx/appendix_acl_matrix.xml src/main/docbkx/appendix_acl_matrix.xml
new file mode 100644
index 0000000..4253cbe
--- /dev/null
+++ src/main/docbkx/appendix_acl_matrix.xml
@@ -0,0 +1,652 @@
+
+
+
+
+ Access Control Matrix
+ The following matrix shows the minimum permission set required to perform operations in
+ HBase. Before using the table, read through the information about how to interpret it.
+
+ Interpreting the ACL Matrix Table
+ The following conventions are used in the ACL Matrix table:
+
+ Scopes
+
+ Permissions are evaluated starting at the widest scope and working to the
+ narrowest scope. A scope corresponds to a level of the data model. From broadest to
+ narrowest, the scopes are as follows:
+
+ Global
+ Namespace (NS)
+ Table
+ Column Family (CF)
+ Column Qualifier (CQ)
+ Cell
+
+ For instance, a permission granted at table level dominates any grants done at
+ the Column Family, Column Qualifier, or cell level. The user can do what that
+ grant implies at any location in the table. A permission granted at global scope
+ dominates all: the user is always allowed to take that action everywhere.
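+ The dominance rule described above can be sketched in plain Java. This is a minimal
+ illustration only, not the AccessController implementation: the class
+ ScopeDominanceSketch and its methods are hypothetical names, and the model
+ deliberately ignores which namespace or table a grant refers to.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of scope dominance; not part of the HBase API. */
public class ScopeDominanceSketch {

    /** Scopes ordered from broadest to narrowest, as listed in the text. */
    public enum Scope { GLOBAL, NAMESPACE, TABLE, COLUMN_FAMILY, COLUMN_QUALIFIER, CELL }

    public enum Permission { ADMIN, CREATE, WRITE, READ, EXECUTE }

    /** Grants held by one user, keyed by the scope at which they were granted. */
    private final Map<Scope, Set<Permission>> grants = new HashMap<>();

    public void grant(Scope scope, Permission permission) {
        grants.computeIfAbsent(scope, s -> EnumSet.noneOf(Permission.class)).add(permission);
    }

    /**
     * Walks from the widest scope down to the requested scope and stops at
     * the first grant that covers the permission.
     */
    public boolean isAllowed(Scope requested, Permission permission) {
        for (Scope scope : Scope.values()) {
            Set<Permission> atScope = grants.getOrDefault(scope, EnumSet.noneOf(Permission.class));
            if (atScope.contains(permission)) {
                return true;
            }
            if (scope == requested) {
                break; // grants at narrower scopes do not apply to this request
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ScopeDominanceSketch user = new ScopeDominanceSketch();
        user.grant(Scope.TABLE, Permission.READ);
        // A table-level READ grant covers reads at every narrower scope.
        System.out.println(user.isAllowed(Scope.CELL, Permission.READ));
        // It does not cover a broader scope or a different permission.
        System.out.println(user.isAllowed(Scope.NAMESPACE, Permission.READ));
        System.out.println(user.isAllowed(Scope.CELL, Permission.WRITE));
    }
}
```

+ In this sketch a grant at a wide scope short-circuits the check, which mirrors the
+ statement that a table-level grant dominates grants at narrower scopes.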
+
+
+
+ Permissions
+
+ Possible permissions include the following:
+
+ Superuser - a special user that belongs to group "supergroup" and has
+ unlimited access
+ Admin (A)
+ Create (C)
+ Write (W)
+ Read (R)
+ Execute (X)
+
+
+
+
+
+ For the most part, permissions work in an expected way, with the following caveats:
+
+
+ Having Write permission does not imply Read permission. It is possible and sometimes
+ desirable for a user to be able to write data that same user cannot read. One such example
+ is a log-writing process.
+
+
+ The hbase:meta table is readable by every user, regardless
+ of the user's other grants or restrictions. This is a requirement for HBase to
+ function correctly.
+
+
+ CheckAndPut and CheckAndDelete operations will fail if the user does not have both
+ Write and Read permission.
+
+
+ Increment and Append operations do not require Read access.
+
+
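+ The caveats above can be illustrated with a small, self-contained Java sketch. The
+ class PermissionCaveatsSketch and its Operation enum are hypothetical and model only
+ the Read and Write requirements stated in the list; they are not the HBase API.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of the Read/Write caveats; not part of the HBase API. */
public class PermissionCaveatsSketch {

    public enum Permission { READ, WRITE }

    /** Each operation lists the permissions the caveats say it needs. */
    public enum Operation {
        GET(EnumSet.of(Permission.READ)),
        PUT(EnumSet.of(Permission.WRITE)),
        INCREMENT(EnumSet.of(Permission.WRITE)),
        APPEND(EnumSet.of(Permission.WRITE)),
        CHECK_AND_PUT(EnumSet.of(Permission.READ, Permission.WRITE)),
        CHECK_AND_DELETE(EnumSet.of(Permission.READ, Permission.WRITE));

        private final Set<Permission> required;

        Operation(Set<Permission> required) {
            this.required = required;
        }

        public Set<Permission> required() {
            return required;
        }
    }

    /** An operation is allowed only if every required permission was granted. */
    public static boolean isAllowed(Set<Permission> granted, Operation operation) {
        return granted.containsAll(operation.required());
    }

    public static void main(String[] args) {
        // A log-writing process that may write but not read.
        Set<Permission> logWriter = EnumSet.of(Permission.WRITE);
        for (Operation op : Operation.values()) {
            System.out.println(op + " allowed: " + isAllowed(logWriter, op));
        }
    }
}
```

+ A write-only user in this model can Put, Increment, and Append, but cannot Get or run
+ CheckAndPut and CheckAndDelete, which need both permissions.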
+
+ The following table is sorted by the interface that provides each operation. In case the
+ table goes out of date, the unit tests which check for accuracy of permissions can be found
+ in
+ hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java,
+ and the access controls themselves can be examined in
+ hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java.
+
+
+ ACL Matrix
+
+
+
+ Interface
+ Operation
+ Minimum Scope
+ Minimum Permission
+
+
+
+
+
+
+ Master
+
+
+ createTable
+
+
+ Global
+
+
+ A
+
+
+
+
+ modifyTable
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ deleteTable
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ truncateTable
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ addColumn
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ modifyColumn
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ deleteColumn
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ disableTable
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ disableAclTable
+
+
+ None
+
+
+ Not allowed
+
+
+
+
+ enableTable
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ move
+
+
+ Global
+
+
+ A
+
+
+
+
+ assign
+
+
+ Global
+
+
+ A
+
+
+
+
+ unassign
+
+
+ Global
+
+
+ A
+
+
+
+
+ regionOffline
+
+
+ Global
+
+
+ A
+
+
+
+
+ balance
+
+
+ Global
+
+
+ A
+
+
+
+
+ balanceSwitch
+
+
+ Global
+
+
+ A
+
+
+
+
+ shutdown
+
+
+ Global
+
+
+ A
+
+
+
+
+ stopMaster
+
+
+ Global
+
+
+ A
+
+
+
+
+ snapshot
+
+
+ Global
+
+
+ A
+
+
+
+
+ clone
+
+
+ Global
+
+
+ A
+
+
+
+
+ restore
+
+
+ Global
+
+
+ A
+
+
+
+
+ deleteSnapshot
+
+
+ Global
+
+
+ A
+
+
+
+
+ createNamespace
+
+
+ Global
+
+
+ A
+
+
+
+
+ deleteNamespace
+
+
+ Namespace
+
+
+ A
+
+
+
+
+ modifyNamespace
+
+
+ Namespace
+
+
+ A
+
+
+
+
+ flushTable
+
+
+ Table
+
+
+ A|C
+
+
+
+
+ getTableDescriptors
+
+
+ Global|Table
+
+
+ A
+
+
+
+
+ mergeRegions
+
+
+ Global
+
+
+ A
+
+
+
+ Region
+
+ open
+ Global
+ A
+
+
+
+ openRegion
+
+
+ Global
+
+
+ A
+
+
+
+ close
+ Global
+ A
+
+
+
+ closeRegion
+
+
+ Global
+
+
+ A
+
+
+
+
+ stopRegionServer
+
+
+ Global
+
+
+ A
+
+
+
+
+ mergeRegions
+
+
+ Global
+
+
+ A
+
+
+
+ append
+ Table|CF|CQ
+ W
+
+
+ delete
+ Table|CF|CQ|Cell (if the user has write permission for all cells)
+ W
+
+
+ exists
+ Table|CF|CQ
+ R
+
+
+ get
+ Table|CF|CQ
+ R
+
+
+ getClosestRowBefore
+ Table|CF|CQ
+ R
+
+
+ increment
+ Table|CF|CQ
+ W
+
+
+ put
+ Table|CF|CQ
+ W
+
+
+
+ flush
+
+
+ Global|Table
+
+
+ A|C
+
+
+
+
+ split
+
+
+ Global|Table
+
+
+ A
+
+
+
+
+ compact
+
+
+ Global|Table
+
+
+ A|C
+
+
+
+ bulkLoadHFile
+ Table
+ W
+
+
+ prepareBulkLoad
+ Table
+ C
+
+
+ cleanupBulkLoad
+ Table
+ W
+
+
+ checkAndDelete
+ Table|CF|CQ
+ RW
+
+
+ checkAndPut
+ Table|CF|CQ
+ RW
+
+
+ incrementColumnValue
+ Table|CF|CQ
+ RW
+
+
+ scannerClose
+ Table
+ R
+
+
+ scannerNext
+ Table
+ R
+
+
+ scannerOpen
+ Table|CF|CQ
+ R
+
+
+
+ Endpoint
+
+
+ invoke
+
+ Endpoint
+
+ X
+
+
+
+
+ AccessController
+
+
+ grant
+
+ Global|Table|NS
+
+ A
+
+
+
+
+ revoke
+
+ Global|Table|NS
+
+ A
+
+
+
+
+ getUserPermissions
+
+
+ Global|Table|NS
+
+
+ A
+
+
+
+
+ checkPermissions
+
+
+ Global|Table|NS
+
+
+ A
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git src/main/docbkx/book.xml src/main/docbkx/book.xml
index 8fc2f7a..8d9fc54 100644
--- src/main/docbkx/book.xml
+++ src/main/docbkx/book.xml
@@ -5293,6 +5293,7 @@ This option should not normally be used, and it is not in -fixAll.
+
@@ -5358,7 +5359,7 @@ This option should not normally be used, and it is not in -fixAll.
-
+ Data Block Encoding TypesPrefix - Often, keys are very similar. Specifically, keys often share a common prefix
diff --git src/main/docbkx/security.xml src/main/docbkx/security.xml
index c74af9b..2489276 100644
--- src/main/docbkx/security.xml
+++ src/main/docbkx/security.xml
@@ -463,1233 +463,1326 @@ grant 'rest_server', 'RWCA'
-
- Tags
- Every cell can have metadata associated with it. Adding metadata in the data part of
- every cell would make things difficult.
- The 0.98 version of HBase solves this problem by providing Tags along with the cell
- format. Some of the usecases that uses the tags are Visibility labels, Cell level ACLs, etc.
- HFile V3 version from 0.98 onwards supports tags and this feature can be turned on using
- the following configuration
-
+ Securing Access To Your Data
+ After you have configured secure authentication between HBase client and server processes
+ and gateways, you need to consider the security of your data itself. HBase provides several
+ strategies for securing your data:
+
+
+ Role-based Access Control (RBAC) controls which users or groups can read and write to
+ a given HBase resource or execute a coprocessor endpoint, using the familiar paradigm of
+ roles.
+
+
+ Visibility Labels which allow you to label cells and control access to labelled cells,
+ to further restrict who can read or write to certain subsets of your data. Visibility
+ labels are stored as tags. See for more information.
+
+
+ Transparent encryption of data at rest on the underlying filesystem, both in HFiles
+ and in the WAL. This protects your data at rest from an attacker who has access to the
+ underlying filesystem, without the need to change the implementation of the client. It can
+ also protect against data leakage from improperly disposed disks, which can be important
+ for legal and regulatory compliance.
+
+
+ Server-side configuration, administration, and implementation details of each of these
+ features are discussed below, along with any performance trade-offs. An example security
+ configuration is given at the end, to show these features all used together, as they might be
+ in a real-world scenario.
+
+ All aspects of security in HBase are in active development and evolving rapidly. Any strategy you employ
+ for security of your data should be thoroughly tested. In addition, some of these features
+ are still in the experimental stage of development. To take advantage of many of these
+ features, you must be running HBase 0.98+ and using the HFile v3 file format.
+
+
+
+
+ Basic Server-Side Configuration
+
+ Enable HFile v3, by setting to 3 in
+ hbase-site.xml. This is the default for HBase 0.99 and
+ newer.
+ hfile.format.version3
- ]]>
- Every cell can have zero or more tags. Every tag has a type and the actual tag byte
- array. The types 0-31 are reserved for System tags. For example ‘1’ is
- reserved for ACL and ‘2’ is reserved for Visibility tags.
- The way rowkeys, column families, qualifiers and values are encoded using different
- Encoding Algos, similarly the tags can also be encoded. Tag encoding can be turned on per CF.
- Default is always turn ON. To turn on the tag encoding on the HFiles use
-
- Note that encoding of tags takes place only if the DataBlockEncoder is enabled for the
- CF.
- As we compress the WAL entries using Dictionary the tags present in the WAL can also be
- compressed using Dictionary. Every tag is compressed individually using WAL Dictionary. To
- turn ON tag compression in WAL dictionary enable the property
-
- hbase.regionserver.wal.tags.enablecompression
+ ]]>
+
+
+ Enable SASL and Kerberos authentication for RPC and ZooKeeper, as described in and .
+
+
+
+
+ Tags
+ Tags are a feature of HFile v3. A tag is a piece of metadata
+ which is part of a cell, separate from the key, value, and version. Tags are an
+ implementation detail which provides a foundation for other security-related features such
+ as cell-level ACLs and visibility labels. Tags are stored in the HFiles themselves. It is
+ possible that in the future, tags will be used to implement other HBase features. You don't
+ need to know a lot about tags in order to use the security features they enable.
+
+ Implementation Details
+ Every cell can have zero or more tags. Every tag has a type and the actual tag byte
+ array.
+ Just as row keys, column families, qualifiers and values can be encoded (see ), tags can also be encoded. You can enable
+ or disable tag encoding at the level of the column family, and it is enabled by default.
+ Use the HColumnDescriptor#setCompressionTags(boolean compressTags) method to
+ manage encoding settings on a column family. You also need to enable the DataBlockEncoder
+ for the column family, for encoding of tags to take effect.
+ You can enable compression of each tag in the WAL, if WAL compression is also enabled,
+ by setting the value of to
+ true in hbase-site.xml. Tag compression uses
+ dictionary encoding.
+ Tag compression is not supported when using WAL encryption.
+
+
+
+
+ Access Control Labels (ACLs)
+
+ How It Works
+ ACLs in HBase are based upon a user's membership in or exclusion from groups, and a
+ given group's permissions to access a given resource. ACLs are implemented as a
+ coprocessor called AccessController.
+ A Hadoop group mapper maps between entities in a directory such
+ as LDAP or Active Directory, and HBase users. HBase does not have groups. Any supported
+ Hadoop group mapper will work. Users are then granted specific permissions (Read, Write,
+ Execute, Create, Admin) against resources (global, namespaces, tables, cells, or
+ endpoints).
+
+ With Kerberos and Access Control enabled, client access to HBase is authenticated
+ and user data is private unless access has been explicitly granted.
+
+ HBase has a simpler security model than relational databases, especially in terms of
+ client operations. No distinction is made between an insert (new record) and update (of
+ existing record), for example, as both collapse down into a Put. Accordingly, the
+ important operations condense to four permissions: READ, WRITE, CREATE, and ADMIN.
+
+ Permissions can be granted in any of the following scopes, though CREATE and ADMIN
+ permissions are effective only at table, namespace, and global scopes.
+
+
+ Namespace
+
+
+
+ Read: User can read any table in the namespace.
+
+
+ Write: User can write to any table in the namespace.
+
+
+ Create: User can create tables in the namespace.
+
+
+ Admin: User can alter table attributes; add, alter, or drop column families;
+ and enable, disable, or drop the table. User can also trigger region
+ (re)assignments or relocation.
+
+
+
+
+
+ Table
+
+
+
+ Read: User can read from any column family in table
+
+
+ Write: User can write to any column family in table
+
+
+ Create: User can alter table attributes; add, alter, or drop column
+ families; and drop the table.
+
+
+ Admin: User can alter table attributes; add, alter, or drop column families;
+ and enable, disable, or drop the table. User can also trigger region
+ (re)assignments or relocation.
+
+
+
+
+
+ Column Family / Column Qualifier / Cell
+
+
+
+ Read: User can read at the specified scope.
+
+
+ Write: User can write at the specified scope.
+
+
+
+
+
+ Coprocessor Endpoint
+
+ Execute: the user can execute the coprocessor endpoint.
+
+
+
+ Global
+
+ Superusers, specified as a comma-separated list of users and groups in the
+ option in hbase-site.xml,
+ have global scope in HBase. The superuser is equivalent to the
+ root user in a UNIX environment. As a minimum, the superuser
+ should include the principal used to run the HMaster process. Global admin
+ privileges, which are implicitly granted to the superuser, are required to create
+ namespaces, switch the balancer on and off, or take other actions with global
+ consequences. The superuser can also grant all permissions to all resources.
+
+
+
+
+
+ ACL Matrix
+ For more details on how ACLs map to specific HBase operations and tasks, see .
+ ACLs can be used together with Visibility Labels.
+ Cell-level ACLs are implemented using tags (see ). In
+ order to use cell-level ACLs, you must be using HFile v3 and HBase 0.98 or newer.
+
+ ACL Implementation Caveats
+
+ Files created by HBase are owned by the operating system user running the HBase
+ process. To interact with HBase files, you should use the API or bulk load
+ facility.
+
+
+ HBase does not model "roles" internally. Instead, group names can be
+ granted permissions. This allows external modeling of roles via group membership.
+ Groups are created and manipulated externally to HBase, via the Hadoop group mapping
+ service.
+
+
+
+
+ Server-Side Configuration
+
+
+ As a prerequisite, perform the steps in .
+
+ Install and configure the AccessController coprocessor, by setting the following
+ properties in hbase-site.xml. These properties take a list of
+ classes.
+ If you use the AccessController along with the VisibilityController, the
+ AccessController must come first in the list, because with both components active, the
+ VisibilityController will delegate access control on its system tables to the
+ AccessController.
+
+ hbase.coprocessor.region.classes
+ org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.token.TokenProvider
+
+
+ hbase.coprocessor.master.classes
+ org.apache.hadoop.hbase.security.access.AccessController
+
+
+ hbase.coprocessor.regionserver.classes
+ org.apache.hadoop.hbase.security.access.AccessController
+
+
+ hbase.security.exec.permission.checkstrue
- ]]>
- To add tags to every cell during Puts, the following apis are provided
-
+ ]]>
+ Optionally, you can enable transport security, by setting
+ to auth-conf. This requires
+ HBase 0.98.4 or newer.
+
+
+ Set up the Hadoop group mapper in the Hadoop namenode's
+ core-site.xml. This is a Hadoop file, not an HBase file.
+ Customize it to your site's needs. Following is an example.
+
+ hadoop.security.group.mapping
+ org.apache.hadoop.security.LdapGroupsMapping
+
- Some of the feature developed using tags are Cell level ACLs and Visibility labels. These
- are some features that use tags framework and allows users to gain better security features on
- cell level.
- For details, see:
-
- Access Control
- Visibility labels
-
-
+
+ hadoop.security.group.mapping.ldap.url
+ ldap://server
+
-
- Access Control
- Newer releases of Apache HBase (>= 0.92) support optional access control list (ACL-)
- based protection of resources on a column family and/or table basis.
- This describes how to set up Secure HBase for access control, with an example of granting
- and revoking user permission on table resources provided.
+
+ hadoop.security.group.mapping.ldap.bind.user
+ Administrator@example-ad.local
+
-
- Prerequisites
- You must configure HBase for secure or simple user access operation. Refer to the Secure Client Access to HBase or Simple User Access to HBase sections and
- complete all of the steps described there.
- For secure access, you must also configure ZooKeeper for secure operation. Changes to
- ACLs are synchronized throughout the cluster using ZooKeeper. Secure authentication to
- ZooKeeper must be enabled or otherwise it will be possible to subvert HBase access control
- via direct client access to ZooKeeper. Refer to the section on secure ZooKeeper
- configuration and complete all of the steps described there.
-
+
+ hadoop.security.group.mapping.ldap.bind.password
+ ****
+
+
+
+ hadoop.security.group.mapping.ldap.base
+ dc=example-ad,dc=local
+
+
+
+ hadoop.security.group.mapping.ldap.search.filter.user
+ (&(objectClass=user)(sAMAccountName={0}))
+
+
+
+ hadoop.security.group.mapping.ldap.search.filter.group
+ (objectClass=group)
+
+
+
+ hadoop.security.group.mapping.ldap.search.attr.member
+ member
+
+
+ hadoop.security.group.mapping.ldap.search.attr.group.name
+ cn
+]]>
+
+
+
+ Optionally, enable the early-out evaluation strategy. Prior to HBase 0.98.0, if a
+ user was not granted access to a column family, or at least a column qualifier, an
+ AccessDeniedException would be thrown. HBase 0.98.0 removed this exception in order to
+ allow cell-level exceptional grants. To restore the old behavior in HBase 0.98.x, set
+ to true in
+ hbase-site.xml.
+
+
+ Distribute your configuration and restart your cluster for changes to take
+ effect.
+
+
+ To test your configuration, log into HBase Shell as a given user and use the
+ whoami command to report the groups your user is part of. In this
+ example, the user is reported as being a member of the services
+ group.
+
+hbase> whoami
+service (auth:KERBEROS)
+ groups: services
+
+
+
+
+
+ Administration
+ Administration tasks can be performed from HBase Shell or via an API.
+
+ API Examples
+ Many of the API examples below are taken from source files
+ hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java
+ and
+ hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/SecureTestUtil.java.
+ Neither the examples, nor the source files they are taken from, are part of the
+ public HBase API, and are provided for illustration only. Refer to the
+ official API for usage instructions.
+
+
+
+ User and Group Administration
+ Users and groups are maintained external to HBase, in your directory.
+
+
+ Granting Access To A Namespace, Table, Column Family, or Cell
+ There are a few different types of syntax for grant statements. The first, and
+ most familiar, is as follows, with the table and column family being optional:
+ grant 'user', 'RWXCA', 'TABLE', 'CF', 'CQ'
+ Groups and users are granted access in the same way, but groups are prefixed with
+ an @ symbol. Likewise, tables and namespaces are specified
+ in the same way, but namespaces are prefixed with an @
+ symbol.
+ It is also possible to grant multiple permissions against the same resource in a
+ single statement, as in this example. The first sub-clause maps users to
+ ACLs and the second sub-clause specifies the resource.
+
+ HBase Shell support for granting and revoking access is for testing and verification
+ support, and should not be employed for production use because it won't apply the
+ permissions to cells that don't exist yet. The correct way to apply cell-level
+ permissions is to do so in the application code when storing the values.
+
+
+ HBase Shell
+
+
+ Global:
+ hbase> grant '@admins', 'RWXCA'
+
+
+ Namespace:
+ hbase> grant 'service', 'RWXCA', '@test-NS'
+
+
+ Table:
+ hbase> grant 'service', 'RWXCA', 'user'
+
+
+ Column Family:
+ hbase> grant '@developers', 'RW', 'user', 'i'
+
+
+ Column Qualifier:
+ hbase> grant 'service', 'RW', 'user', 'i', 'foo'
+
+
+ Cell:
+ Cell ACLs are granted using the following syntax:
+ grant <table>, \
+ { '<user-or-group>' => \
+ '<permissions>', ... }, \
+ { <scanner-specification> }
+
+
+ <user-or-group> is the user or group
+ name, prefixed with @ in the case of a group.
+
+
+ <permissions> is a string containing
+ any or all of "RWXCA", though only R and W are meaningful at cell
+ scope.
+
+
+ <scanner-specification> is the scanner
+ specification syntax and conventions used by the 'scan' shell command. See
+ hbase-shell/src/main/ruby/shell/commands/scan.rb for
+ some examples of scanner specifications.
+
+
+ This example grants read access to the 'testuser' user and read/write access
+ to the 'developers' group, on cells in the 'pii' column which match the
+ filter.
+ hbase> grant 'user', \
+ { '@developers' => 'RW', 'testuser' => 'R' }, \
+ { COLUMNS => 'pii', FILTER => "(PrefixFilter ('test'))" }
+ The shell will run a scanner with the given criteria, rewrite the found
+ cells with new ACLs, and store them back to their exact coordinates.
+
+
+ In addition, the alter command has been extended to allow for a
+ change in table ownership:
+ hbase> alter 'tablename', {OWNER => 'username|@group'}
+
+
+ API
+ The following example shows how to grant access at the
+ table level.
+ () {
+ @Override
+ public Void call() throws Exception {
+ HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME);
+ try {
+ BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
+ AccessControlService.BlockingInterface protocol =
+ AccessControlService.newBlockingStub(service);
+ ProtobufUtil.grant(protocol, user, table, family, qualifier, actions);
+ } finally {
+ acl.close();
+ }
+ return null;
+ }
+ });
+} ]]>
+
+ You can also use the Mutation.setACL method:
+ perms)
+ ]]>
+
+ This example provides read permission to a user called
+ user1:
+
+
+
+
+ Revoking Access Control From a Namespace, Table, Column Family, or Cell
+ The revoke command and API are twins of the grant command and
+ API, and the syntax is exactly the same. The only exception is that you cannot revoke
+ permissions at the cell level. You can only revoke access that has previously been
+ granted, and a revoke statement is not the same thing as explicit
+ denial to a resource.
+
+ HBase Shell support for granting and revoking access is for testing and verification
+ support, and should not be employed for production use because it won't apply the
+ permissions to cells that don't exist yet. The correct way to apply cell-level
+ permissions is to do so in the application code when storing the values.
+
+
+ Revoking Access To a Table
+
+() {
+ @Override
+ public Void call() throws Exception {
+ HTable acl = new HTable(util.getConfiguration(), AccessControlLists.ACL_TABLE_NAME);
+ try {
+ BlockingRpcChannel service = acl.coprocessorService(HConstants.EMPTY_START_ROW);
+ AccessControlService.BlockingInterface protocol =
+ AccessControlService.newBlockingStub(service);
+ ProtobufUtil.revoke(protocol, user, table, family, qualifier, actions);
+ } finally {
+ acl.close();
+ }
+ return null;
+ }
+ });
+} ]]>
+
+
+
+
+ Showing a User's Effective Permissions
+
+ HBase Shell
+ hbase> user_permission 'user'
+ hbase> user_permission '.*'
+ hbase> user_permission JAVA_REGEX
+
+
+ API
+ ) {
+ List> results = (List>) obj;
+ if (results != null && results.isEmpty()) {
+ fail("Empty non null results from action for user '" + user.getShortName() + "'");
+ }
+ assertEquals(count, results.size());
+ }
+ } catch (AccessDeniedException ade) {
+ fail("Expected action to pass for user '" + user.getShortName() + "' but was denied");
+ }
+}]]>
+
+
+
+ Cell-First Strategy
+ By default, ACLs are evaluated from least granular to most granular, and when an
+ ACL is reached that grants permission, evaluation stops. If you use cell ACLs and you
+ want the cell ACL to be evaluated first, you can use the method
+ Mutation.setACLStrategy(boolean cellFirstStrategy).
+
+
+
+
+
- Overview
- With Secure RPC and Access Control enabled, client access to HBase is authenticated and
- user data is private unless access has been explicitly granted. Access to data can be
- granted at a table or per column family basis.
- However, the following items have been left out of the initial implementation for
- simplicity:
-
-
- Row-level or per value (cell): Using Tags in HFile V3
-
-
- Push down of file ownership to HDFS: HBase is not designed for the case where files
- may have different permissions than the HBase system principal. Pushing file ownership
- down into HDFS would necessitate changes to core code. Also, while HDFS file ownership
- would make applying quotas easy, and possibly make bulk imports more straightforward, it
- is not clear that it would offer a more secure setup.
-
-
- HBase managed "roles" as collections of permissions: We will not model "roles"
- internally in HBase to begin with. We instead allow group names to be granted
- permissions, which allows external modeling of roles via group membership. Groups are
- created and manipulated externally to HBase, via the Hadoop group mapping
- service.
-
-
- Access control mechanisms are mature and fairly standardized in the relational database
- world. The HBase implementation approximates current convention, but HBase has a simpler
- feature set than relational databases, especially in terms of client operations. We don't
- distinguish between an insert (new record) and update (of existing record), for example, as
- both collapse down into a Put. Accordingly, the important operations condense to four
- permissions: READ, WRITE, CREATE, and ADMIN.
+ Visibility Labels
+ Visibility labels can be used to permit only users or principals associated with
+ a given label to read or access cells with that label. For instance, you might label a cell
+ top-secret, and only grant access to that label to the
+ managers group. Visibility labels are implemented using Tags, which are
+ a feature of HFile v3, and allow you to store metadata on a per-cell basis. A label is a
+ string, and labels can be combined into expressions by using logical operators (&, |, or
+ !), and using parentheses for grouping. The | operator is not an
+ exclusive OR. HBase does not do any kind of validation of expressions beyond basic
+ well-formedness. Visibility labels have no meaning on their own. They may be used to denote
+ sensitivity level, privilege level, or any other arbitrary semantic meaning.
+ If a user's labels do not match a cell's label or expression, the user is
+ denied access to the cell.
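+ The way a label expression is matched against a user's labels can be sketched as a
+ small recursive-descent evaluator. This is an illustration only, not HBase's
+ expression parser: the class VisibilityExpressionSketch is hypothetical, it assumes
+ that & binds more tightly than |, and it accepts only letters, digits, underscores,
+ and hyphens in label names. Use parentheses where precedence matters.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical evaluator for label expressions; not HBase's parser. */
public class VisibilityExpressionSketch {
    private final String expr;
    private final Set<String> userLabels;
    private int pos;

    private VisibilityExpressionSketch(String expr, Set<String> userLabels) {
        this.expr = expr;
        this.userLabels = userLabels;
    }

    /** Returns true if the user's labels satisfy the expression. */
    public static boolean matches(String expression, Set<String> userLabels) {
        VisibilityExpressionSketch parser = new VisibilityExpressionSketch(expression, userLabels);
        boolean result = parser.parseOr();
        parser.skipSpaces();
        if (parser.pos != expression.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    // or := and ('|' and)*   (inclusive OR, as the text notes)
    private boolean parseOr() {
        boolean value = parseAnd();
        while (peek() == '|') {
            pos++;
            boolean right = parseAnd(); // always parse the operand to consume input
            value = value || right;
        }
        return value;
    }

    // and := unary ('&' unary)*
    private boolean parseAnd() {
        boolean value = parseUnary();
        while (peek() == '&') {
            pos++;
            boolean right = parseUnary();
            value = value && right;
        }
        return value;
    }

    // unary := '!' unary | '(' or ')' | label
    private boolean parseUnary() {
        char c = peek();
        if (c == '!') {
            pos++;
            return !parseUnary();
        }
        if (c == '(') {
            pos++;
            boolean value = parseOr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (pos < expr.length() && isLabelChar(expr.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a label at position " + pos);
        }
        return userLabels.contains(expr.substring(start, pos));
    }

    private char peek() {
        skipSpaces();
        return pos < expr.length() ? expr.charAt(pos) : '\0';
    }

    private void skipSpaces() {
        while (pos < expr.length() && Character.isWhitespace(expr.charAt(pos))) {
            pos++;
        }
    }

    private static boolean isLabelChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-';
    }

    public static void main(String[] args) {
        Set<String> labels = new HashSet<>(Arrays.asList("secret"));
        System.out.println(matches("( secret | topsecret ) & !probationary", labels));
    }
}
```

+ The sketch treats a bare label as "the user is associated with this label", so an
+ expression such as !public matches only users who lack the public label.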
+ In HBase 0.98.6 and newer, UTF-8 encoding is supported for visibility labels and
+ expressions. When creating labels using the addLabels() method and passing
+ labels in Authorizations via Scan or Get, labels can contain UTF-8 characters, as well as
+ the characters !, &, and |, written in normal Java notation without any
+ escaping. However, when you pass a CellVisibility expression via a Mutation, you
+ must enclose the expression with the CellVisibility.quote() method if it uses
+ UTF-8 characters or the characters !, &, and |, which are otherwise treated as expression
+ operators. See TestExpressionParser and the source file
+ hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestScan.java.
+
+ A user adds visibility expressions to a cell during a Put operation. The user does not
+ need access to a label in order to label cells with it, by default. This behavior is
+ controlled by the configuration option
+ . If you set this option to
+ true, the labels the user is modifying as part of the mutation must be
+ associated with the user, or the mutation will fail. Whether a user is authorized to read a
+ labelled cell is determined during a Get or Scan, and results which the user is not allowed
+ to read are filtered out. This incurs the same I/O penalty as if the results were returned,
+ but reduces load on the network.
+ Visibility labels can also be specified during Delete operations. For details about
+ visibility labels and Deletes, see HBASE-10885.
+ The user's effective label set is built in the RPC context when a request is first
+ received by the RegionServer. The way that users are associated with labels is pluggable.
+ The default plugin passes through labels specified in Authorizations added to the Get or
+ Scan and checks those against the calling user's authenticated labels list. When the client
+ passes labels for which the user is not authenticated, the default plugin drops them. You
+ can pass a subset of user authenticated labels via the Get#setAuthorizations(new
+ Authorizations(String,...)) and Scan#setAuthorizations(new
+ Authorizations(String,...)); APIs.
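+ The behavior of the default plugin described above amounts to intersecting the labels
+ requested by the client with the labels the user is authenticated for. A minimal sketch
+ in plain Java follows. The class EffectiveLabelsSketch and its method are hypothetical
+ and stand in for logic that HBase performs on the server side.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of how requested labels are filtered; not the HBase API. */
public class EffectiveLabelsSketch {

    /**
     * Keeps only the requested labels that the user is authenticated for.
     * Labels the user does not hold are silently dropped, not rejected.
     */
    public static Set<String> effectiveLabels(Set<String> authenticated, List<String> requested) {
        Set<String> effective = new LinkedHashSet<>(requested);
        effective.retainAll(authenticated);
        return effective;
    }

    public static void main(String[] args) {
        Set<String> authenticated = new HashSet<>(Arrays.asList("secret", "confidential"));
        // The client asks for one label it holds and one it does not.
        List<String> requested = Arrays.asList("secret", "topsecret");
        System.out.println(effectiveLabels(authenticated, requested));
    }
}
```

+ Passing a subset of the user's own labels leaves the request unchanged in this model,
+ which corresponds to narrowing a Get or Scan to fewer labels than the user holds.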
+ Visibility label access checking is performed by the VisibilityController coprocessor.
+ You can use interface VisibilityLabelService to provide a custom implementation
+ and/or control the way that visibility labels are stored with cells. See the source file
+ hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithCustomVisLabService.java
+ for one example.
+
+ Visibility labels can be used in conjunction with ACLs.
- Operation To Permission Mapping
-
-
-
+ Examples of Visibility Expressions
+
- Permission
- Operation
+ Expression
+ Interpretation
-
-
- Read
- Get
-
-
-
- Exists
-
-
-
- Scan
-
-
-
- Write
- Put
-
-
-
- Delete
-
-
-
- Lock/UnlockRow
-
-
-
- IncrementColumnValue
-
-
-
- CheckAndDelete/Put
-
-
-
- Create
- Create
-
-
-
- Alter
-
-
-
- Drop
-
-
-
- Bulk Load
-
-
- Admin
- Enable/Disable
+ fulltime
+ Only allow access to users associated with the fulltime
+ label.
-
- Snapshot/Restore/Clone
+ !public
+ Allow access to users not associated with the public
+ label.
-
- Split
-
-
-
- Flush
-
-
-
- Compact
-
-
-
- Major Compact
-
-
-
- Grant
-
-
-
- Revoke
-
-
-
- Shutdown
+ ( secret | topsecret ) & !probationary
+ The user must be associated with either the secret and/or
+ topsecret label, and not be associated with the
+ probationary label.
- Permissions can be granted in any of the following scopes, though CREATE and ADMIN
- permissions are effective only at table scope.
+
+ Server-Side Configuration
+
+
+ As a prerequisite, perform the steps in .
+
+ Install and configure the VisibilityController coprocessor, by setting the
+ following properties in hbase-site.xml. These properties take a
+ list of classes.
+ If you use the AccessController and VisibilityController coprocessors together,
+ the AccessController must come first in the list, because with both components
+ active, the VisibilityController will delegate access control on its system tables
+ to the AccessController.
+
+ hbase.coprocessor.region.classes
+ org.apache.hadoop.hbase.security.visibility.VisibilityController
+
+
+ hbase.coprocessor.master.classes
+ org.apache.hadoop.hbase.security.visibility.VisibilityController
+
+ ]]>
+
+
+ Adjust Configuration
+ By default, users can label cells with any label, including labels they are not
+ associated with, which means that a user can Put data that they cannot read. For
+ example, a user could label a cell with the (hypothetical) 'topsecret' label even if
+ the user is not associated with that label. If you only want users to be able to label
+ cells with labels they are associated with, set
+ hbase.security.visibility.mutations.checkauths to
+ true. In that case, the mutation will fail if it makes use of
+ labels the user is not associated with.
+
+
+ Distribute your configuration and restart your cluster for changes to take
+ effect.
+
+
+
+
+ Administration
+ Administration tasks can be performed using the HBase Shell or the Admin API. For
+ defining the list of visibility labels and associating labels with users, the
+ HBase Shell is probably simpler.
+
+ API Examples
+ Many of the Java API examples in this section are taken from the source file
+ hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabels.java.
+ Refer to that file or the API documentation for more context.
+ Neither these examples, nor the source file they were taken from, are part of the
+ public HBase API, and are provided for illustration only. Refer to the official API
+ for usage instructions.
+
+
+
+ Define the List of Visibility Labels
+
+ HBase Shell
+ hbase> add_labels [ 'admin', 'service', 'developer', 'test' ]
+
+
+ Java API
+public static void addLabels() throws Exception {
+  PrivilegedExceptionAction<VisibilityLabelsResponse> action =
+      new PrivilegedExceptionAction<VisibilityLabelsResponse>() {
+    public VisibilityLabelsResponse run() throws Exception {
+      String[] labels = { SECRET, TOPSECRET, CONFIDENTIAL, PUBLIC, PRIVATE, COPYRIGHT, ACCENT,
+          UNICODE_VIS_TAG, UC1, UC2 };
+      try {
+        VisibilityClient.addLabels(conf, labels);
+      } catch (Throwable t) {
+        throw new IOException(t);
+      }
+      return null;
+    }
+  };
+  // Only the superuser may define labels.
+  SUPERUSER.runAs(action);
+}
+ ]]>
+
+
+
+ Associate Labels with Users
+
+ HBase Shell
+ hbase> set_auths 'service', [ 'service' ]
+ hbase> set_auths 'testuser', [ 'test' ]
+ hbase> set_auths 'qa', [ 'test', 'developer' ]
+
+
+ Java API
+PrivilegedExceptionAction<Void> action = new PrivilegedExceptionAction<Void>() {
+  public Void run() throws Exception {
+    String[] auths = { SECRET, CONFIDENTIAL };
+    try {
+      VisibilityClient.setAuths(conf, auths, user);
+    } catch (Throwable e) {
+      // The test ignores failures here; application code should handle or rethrow.
+    }
+    return null;
+  }
+  ...
+ ]]>
+
+
+
+ Clear Labels From Users
+
+ HBase Shell
+ hbase> clear_auths 'service', [ 'service' ]
+ hbase> clear_auths 'testuser', [ 'test' ]
+ hbase> clear_auths 'qa', [ 'test', 'developer' ]
+
+
+ Java API
+
+
+
+
+ Apply a Label or Expression to a Cell
+ The label is only applied when data is written. The label is associated with a
+ given version of the cell.
+
+ HBase Shell
+ hbase> set_visibility 'user', 'admin|service|developer', \
+   { COLUMNS => 'i' }
+ hbase> set_visibility 'user', 'admin|service', \
+   { COLUMNS => 'pii' }
+ hbase> set_visibility 'user', 'test', \
+   { COLUMNS => [ 'i', 'pii' ], FILTER => "(PrefixFilter ('test'))" }
+
+
+ HBase Shell support for applying labels or permissions to cells is intended for
+ testing and verification, and should not be used in production, because it
+ does not apply the labels to cells that do not exist yet. The correct way to apply
+ cell-level labels is to do so in the application code when storing the values.
+
+
+ Java API
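+static HTable createTableAndWriteDataWithLabels(TableName tableName, String... labelExps)
+    throws Exception {
+  HTable table = null;
+  try {
+    table = TEST_UTIL.createTable(tableName, fam);
+    int i = 1;
+    List<Put> puts = new ArrayList<Put>();
+    for (String labelExp : labelExps) {
+      Put put = new Put(Bytes.toBytes("row" + i));
+      put.add(fam, qual, HConstants.LATEST_TIMESTAMP, value);
+      // The visibility expression is attached to the cell at write time.
+      put.setCellVisibility(new CellVisibility(labelExp));
+      puts.add(put);
+      i++;
+    }
+    table.put(puts);
+  } finally {
+    if (table != null) {
+      table.flushCommits();
+    }
+  }
+  return table;
+}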
+ ]]>
+
+
+
+
+
+ Implementing Your Own Visibility Label Algorithm
+ Interpreting the labels authenticated for a given get/scan request is a pluggable
+ algorithm. You can specify a custom plugin by using the property
+ hbase.regionserver.scan.visibility.label.generator.class. The default
+ implementation class is
+ org.apache.hadoop.hbase.security.visibility.DefaultScanLabelGenerator. You
+ can also configure a set of ScanLabelGenerators to be used by the system, as
+ a comma-separated list.
+
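The previous version of this section described the default behavior as passing through the labels given in the Authorizations of a Get or Scan, and dropping any label for which the calling user is not authenticated. The following minimal sketch shows only that filtering rule in plain Java. The class and method names are hypothetical and do not reflect the ScanLabelGenerator interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrates the "drop unauthenticated labels" rule; not the HBase plugin API. */
final class ScanLabelSketch {
  /** Keeps the requested labels, in order, that the user is authenticated for. */
  static List<String> effectiveLabels(List<String> requested, Set<String> userAuths) {
    List<String> result = new ArrayList<>();
    for (String label : requested) {
      if (userAuths.contains(label)) {
        result.add(label);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> auths = new HashSet<>(Arrays.asList("secret", "confidential"));
    System.out.println(effectiveLabels(Arrays.asList("secret", "topsecret"), auths));
  }
}
```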
+
-
-
- Table
-
-
-
- Read: User can read from any column family in table
-
-
- Write: User can write to any column family in table
-
-
- Create: User can alter table attributes; add, alter, or drop column families;
- and drop the table.
-
-
- Admin: User can alter table attributes; add, alter, or drop column families;
- and enable, disable, or drop the table. User can also trigger region
- (re)assignments or relocation.
-
-
-
-
-
- Column Family
-
-
-
- Read: User can read from the column family
-
-
- Write: User can write to the column family
-
-
-
-
-
+
+ Transparent Encryption of Data At Rest
+ HBase provides a mechanism for protecting your data at rest, in HFiles and the WAL, which
+ reside within HDFS or another distributed filesystem. A two-tier architecture is used for
+ flexible and non-intrusive key rotation. "Transparent" means that no implementation changes
+ are needed at the client side. When data is written, it is encrypted. When it is read, it is
+ decrypted on demand.
+
+ How It Works
+ The administrator provisions a master key for the cluster, stored in a key
+ provider that is accessible to every trusted HBase process, including the HMaster,
+ RegionServers, and clients (such as HBase Shell) that reside on administrative
+ workstations. The default key provider is integrated with the Java KeyStore API and any
+ key management systems with support for it. Other custom key provider implementations are
+ possible. The key retrieval mechanism is configured in the
+ hbase-site.xml configuration file. The master key may be stored on
+ the cluster servers, protected by a secure KeyStore file, or on an external keyserver, or
+ in a hardware security module. This master key is resolved as needed by HBase processes
+ through the configured key provider.
+ Next, encryption use can be specified in the schema, per Column Family, by creating
+ or modifying a column descriptor to include two additional attributes: the name of the
+ encryption algorithm to use (currently only "AES" is supported), and, optionally, a data
+ key wrapped (encrypted) with the cluster master key. If a data key is not explicitly
+ configured for a ColumnFamily, HBase will create a random data key per HFile. This
+ provides an incremental improvement in security over the alternative. Unless you need to
+ supply an explicit data key, such as in a case where you are generating encrypted HFiles
+ for bulk import with a given data key, only specify the encryption algorithm in the
+ ColumnFamily schema metadata and let HBase create data keys on demand. Per Column Family
+ keys facilitate low impact incremental key rotation and reduce the scope of any external
+ leak of key material. The wrapped data key is stored in the ColumnFamily schema metadata,
+ and in each HFile for the Column Family, encrypted with the cluster master key. After the
+ Column Family is configured for encryption, any new HFiles will be written encrypted. To
+ ensure encryption of all HFiles, trigger a major compaction after enabling this
+ feature.
+ When the HFile is opened, the data key is extracted from the HFile, decrypted with the
+ cluster master key, and used for decryption of the remainder of the HFile. The HFile will
+ be unreadable if the master key is not available. If a remote user somehow acquires access
+ to the HFile data because of some lapse in HDFS permissions, or from inappropriately
+ discarded media, it will not be possible to decrypt either the data key or the file
+ data.
+ It is also possible to encrypt the WAL. Even though WALs are transient, it is
+ necessary to encrypt the WALEdits to avoid circumventing HFile protections for encrypted
+ column families, in the event that the underlying filesystem is compromised. When WAL
+ encryption is enabled, all WALs are encrypted, regardless of whether the relevant HFiles
+ are encrypted.
+
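The two-tier scheme described above can be sketched with the standard JCE classes. This is an illustration under stated assumptions, not HBase's on-disk format or its `EncryptionUtil` implementation: the class `TwoTierKeySketch` is hypothetical, and the JCE `AESWrap` transformation is used here only as a stand-in for whatever wrapping HBase performs internally.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Two-tier key sketch: a master key wraps a data key, the data key encrypts data. */
final class TwoTierKeySketch {
  static SecretKey newAesKey() throws Exception {
    KeyGenerator gen = KeyGenerator.getInstance("AES");
    gen.init(128);
    return gen.generateKey();
  }

  /** Wraps (encrypts) the data key with the master key. */
  static byte[] wrap(SecretKey master, SecretKey dataKey) throws Exception {
    Cipher c = Cipher.getInstance("AESWrap");
    c.init(Cipher.WRAP_MODE, master);
    return c.wrap(dataKey);
  }

  /** Recovers the data key; fails if the master key is not the one used to wrap. */
  static SecretKey unwrap(SecretKey master, byte[] wrapped) throws Exception {
    Cipher c = Cipher.getInstance("AESWrap");
    c.init(Cipher.UNWRAP_MODE, master);
    return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
  }

  /** Encrypts or decrypts with AES in CTR mode, depending on {@code mode}. */
  static byte[] crypt(int mode, SecretKey dataKey, byte[] iv, byte[] input) throws Exception {
    Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
    c.init(mode, dataKey, new IvParameterSpec(iv));
    return c.doFinal(input);
  }

  /** Writes and then reads back a value the way a writer and a reader would. */
  static String roundTrip(String plaintext) throws Exception {
    SecretKey master = newAesKey();   // provisioned once for the cluster
    SecretKey dataKey = newAesKey();  // random key for one file

    // Writer side: store the wrapped data key and IV next to the ciphertext.
    byte[] wrapped = wrap(master, dataKey);
    byte[] iv = new byte[16];
    new SecureRandom().nextBytes(iv);
    byte[] ciphertext = crypt(Cipher.ENCRYPT_MODE, dataKey, iv,
        plaintext.getBytes(StandardCharsets.UTF_8));

    // Reader side: only the master key, wrapped key, IV, and ciphertext are known.
    SecretKey recovered = unwrap(master, wrapped);
    return new String(crypt(Cipher.DECRYPT_MODE, recovered, iv, ciphertext),
        StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(roundTrip("cell value"));
  }
}
```

Because only the wrapped form of the data key is stored with the file, someone who obtains the file without the master key can recover neither the data key nor the data.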
+
+ Server-Side Configuration
+ This procedure assumes you are using the default Java keystore implementation. If you
+ are using a custom implementation, check its documentation and adjust accordingly.
+
+
+ Create a secret key of appropriate length for AES encryption, using the
+ keytool utility.
+ $ keytool -keystore /path/to/hbase/conf/hbase.jks \
+ -storetype jceks -storepass <password> \
+ -genseckey -keyalg AES -keysize 128 \
+ -alias <alias>
+ Replace <password> with the password for the keystore file and <alias>
+ with the username of the HBase service account, or an arbitrary string. If you use an
+ arbitrary string, you will need to configure HBase to use it, and that is covered
+ below. Specify a keysize that is appropriate. Do not specify a separate password for
+ the key, but press Return when prompted.
+
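For readers who prefer to see what the keytool command produces, the following minimal sketch builds an equivalent JCEKS keystore in memory and reads the key back, using only the standard `java.security.KeyStore` API. The class name `KeyStoreSketch` is hypothetical, and the alias and password are placeholder values. As in the command above, the key entry uses the same password as the store.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.Key;
import java.security.KeyStore;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** In-memory equivalent of "keytool -storetype jceks -genseckey -keyalg AES -keysize 128". */
final class KeyStoreSketch {
  /** Creates a JCEKS keystore holding one 128-bit AES key under the given alias. */
  static byte[] createKeyStore(String alias, char[] password) throws Exception {
    KeyGenerator gen = KeyGenerator.getInstance("AES");
    gen.init(128);
    SecretKey key = gen.generateKey();

    KeyStore ks = KeyStore.getInstance("JCEKS");
    ks.load(null, password); // initialise an empty keystore
    ks.setEntry(alias, new KeyStore.SecretKeyEntry(key),
        new KeyStore.PasswordProtection(password)); // key password == store password
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ks.store(out, password);
    return out.toByteArray();
  }

  /** Loads the keystore bytes and returns the key stored under the alias, or null. */
  static Key loadKey(byte[] keyStoreBytes, String alias, char[] password) throws Exception {
    KeyStore ks = KeyStore.getInstance("JCEKS");
    ks.load(new ByteArrayInputStream(keyStoreBytes), password);
    return ks.getKey(alias, password);
  }

  public static void main(String[] args) throws Exception {
    char[] password = "changeit".toCharArray(); // placeholder password
    byte[] store = createKeyStore("hbase", password);
    Key key = loadKey(store, "hbase", password);
    System.out.println(key.getAlgorithm() + " key, " + key.getEncoded().length * 8 + " bits");
  }
}
```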
+
+ Set appropriate permissions on the keyfile and distribute it to all the HBase
+ servers.
+ The previous command created a file called hbase.jks in the
+ HBase conf/ directory. Set the permissions and ownership on this
+ file such that only the HBase service account user can read the file, and use a
+ secure mechanism, such as SSH, to distribute the key to all
+ HBase servers.
+
+
+ Configure the HBase daemons.
+ Set the following properties to configure HBase daemons to use a key provider
+ backed by the KeyStore file for retrieving the cluster master key.
+
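+ <property>
+   <name>hbase.crypto.keyprovider</name>
+   <value>org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider</value>
+ </property>
+ <property>
+   <name>hbase.crypto.keyprovider.parameters</name>
+   <value>jceks:///path/to/hbase/conf/hbase.jks?password=</value>
+ </property>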
+
+ ]]>
+ By default, the HBase service account name will be used to resolve the cluster
+ master key. However, you can store it with an arbitrary alias (in the
+ keytool command). In that case, set the following property to the
+ alias you used.
+
+ <property>
+   <name>hbase.crypto.master.key.name</name>
+   <value>my-alias</value>
+ </property>
+]]>
+
+ You also need to be sure your HFiles use HFile v3, in order to use transparent
+ encryption. Set the following property in your hbase-site.xml
+ file.
+
+ <property>
+   <name>hfile.format.version</name>
+   <value>3</value>
+ </property>
+]]>
+
+ Optionally, you can use a different cipher provider, either a Java Cryptography
+ Extension (JCE) algorithm provider or a custom HBase cipher implementation.
+
+
+ JCE:
+
+
+ Install a signed JCE provider (supporting “AES/CTR/NoPadding” mode with
+ 128 bit keys)
+
+
+ Add it with highest preference to the JCE site configuration file
+ $JAVA_HOME/lib/security/java.security.
+
+
+ Update and
+ options in
+ hbase-site.xml.
+
+
+
+
+ Custom HBase Cipher:
+
+
+ Implement
+ org.apache.hadoop.hbase.io.crypto.CipherProvider.
+
+
+ Add the implementation to the server classpath.
+
+
+ Update in
+ hbase-site.xml.
+
+
+
+
+
+
+ Configure WAL encryption.
+ Configure WAL encryption in every RegionServer's
+ hbase-site.xml, by setting the following properties. You can
+ include these in the HMaster's hbase-site.xml as well, but the
+ HMaster does not have a WAL and will not use them.
+
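+ <property>
+   <name>hbase.regionserver.hlog.reader.impl</name>
+   <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
+ </property>
+ <property>
+   <name>hbase.regionserver.hlog.writer.impl</name>
+   <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
+ </property>
+ <property>
+   <name>hbase.regionserver.wal.encryption</name>
+   <value>true</value>
+ </property>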
+
+ ]]>
+
+
+ Configure permissions on the hbase-site.xml file.
+ Because the keystore password is stored in the hbase-site.xml, you need to ensure
+ that only the HBase user can read the hbase-site.xml file, using
+ file ownership and permissions. Also, whenever you distribute the configuration to
+ your cluster nodes, be sure to use a secure means of transport, such as
+ ssh.
+
+
+ Distribute the new configuration and restart your cluster.
+ Distribute the new configuration file using a secure mechanism, and restart your
+ cluster.
+
+
+
+
+ Administration
+ Administrative tasks can be performed in HBase Shell or the Java API.
+
+ Java API
+ Java API examples in this section are taken from the source file
+ hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckEncryption.java.
+ Neither these examples, nor the source files they are taken from, are part of the
+ public HBase API, and are provided for illustration only. Refer to the official API
+ for usage instructions.
+
+
+
+ Enable Encryption on a Column Family
+
+ To enable encryption on a column family, you can either use HBase Shell or the
+ Java API. After enabling encryption, trigger a major compaction. When the major
+ compaction completes, the HFiles will be encrypted.
+
+ HBase Shell
+
+hbase> disable 'mytable'
+hbase> alter 'mytable', {NAME => 'mycf', ENCRYPTION => 'AES'}
+hbase> enable 'mytable'
+
+
+
+ Java API
+ You can use the HBaseAdmin#modifyColumn API to modify the
+ ENCRYPTION attribute on a Column Family. Additionally, you
+ can specify the specific key to use as the wrapper, by setting the
+ ENCRYPTION_KEY attribute. This is only possible via the
+ API, and not the HBase Shell.
+ This example shows how to programmatically enable transparent encryption both
+ in the server configuration and at the column family, as part of a test which uses
+ the Minicluster configuration.
+
+@Before
+public void setUp() throws Exception {
+ conf = TEST_UTIL.getConfiguration();
+ conf.setInt("hfile.format.version", 3);
+ conf.set(HConstants.CRYPTO_KEYPROVIDER_CONF_KEY, KeyProviderForTesting.class.getName());
+ conf.set(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, "hbase");
+
+ // Create the test encryption key
+ SecureRandom rng = new SecureRandom();
+ byte[] keyBytes = new byte[AES.KEY_LENGTH];
+ rng.nextBytes(keyBytes);
+ cfKey = new SecretKeySpec(keyBytes, "AES");
- There is also an implicit global scope for the superuser.
- The superuser is a principal, specified in the HBase site configuration file, that has
- equivalent access to HBase as the 'root' user would on a UNIX derived system. Normally this
- is the principal that the HBase processes themselves authenticate as. Although future
- versions of HBase Access Control may support multiple superusers, the superuser privilege
- will always include the principal used to run the HMaster process. Only the superuser is
- allowed to create tables, switch the balancer on or off, or take other actions with global
- consequence. Furthermore, the superuser has an implicit grant of all permissions to all
- resources.
- Tables have a new metadata attribute: OWNER, the user principal who owns the table. By
- default this will be set to the user principal who creates the table, though it may be
- changed at table creation time or during an alter operation by setting or changing the OWNER
- table attribute. Only a single user principal can own a table at a given time. A table owner
- will have all permissions over a given table.
+ // Start the minicluster
+ TEST_UTIL.startMiniCluster(3);
+
+ // Create the table
+ htd = new HTableDescriptor(TableName.valueOf("default", "TestHBaseFsckEncryption"));
+ HColumnDescriptor hcd = new HColumnDescriptor("cf");
+ hcd.setEncryptionType("AES");
+ hcd.setEncryptionKey(EncryptionUtil.wrapKey(conf,
+ conf.get(HConstants.CRYPTO_MASTERKEY_NAME_CONF_KEY, User.getCurrent().getShortName()),
+ cfKey));
+ htd.addFamily(hcd);
+ TEST_UTIL.getHBaseAdmin().createTable(htd);
+ TEST_UTIL.waitTableAvailable(htd.getName(), 5000);
+}
+
+
+
+
+
+ Rotate the Data Key
+
+ To rotate the data key, first change the ColumnFamily key in the column
+ descriptor, then trigger a major compaction. When compaction is complete, all HFiles
+ will be re-encrypted using the new data key. Until the compaction completes, the
+ old HFiles will still be readable using the old key.
+
+
+
+ Rotate the Master Key
+
+ To rotate the master key, first generate and distribute the new key. Then update
+ the KeyStore to contain a new master key, and keep the old master key in the
+ KeyStore using a different alias. Next, configure fallback to the old master key in
+ the hbase-site.xml file.
+
+ <property>
+   <name>hbase.crypto.master.alternate.key.name</name>
+   <value>hbase.old</value>
+ </property>
+
+ ]]>
+ Perform a rolling restart of your cluster for this change to take effect. Trigger a major
+ compaction on each table. At the end of the major compaction, all HFiles will be
+ re-encrypted with data keys wrapped by the new cluster key. At this point, you can
+ remove the old master key from the KeyStore, remove the configuration for the
+ fallback master key from the hbase-site.xml, and perform a
+ second rolling restart at some point. This second rolling restart is not
+ time-sensitive.
+
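The fallback step in the rotation procedure above amounts to "try the current master key, then the alternate". The following is a minimal sketch of that idea using the JCE `AESWrap` transformation. It is an assumption-laden illustration, not HBase's key resolution code: the class `MasterKeyFallbackSketch` and its methods are hypothetical, and `rewrap` only mimics the effect of the major compaction described above.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** Sketch of master key rotation with a fallback (alternate) master key. */
final class MasterKeyFallbackSketch {
  static SecretKey newAesKey() throws GeneralSecurityException {
    KeyGenerator gen = KeyGenerator.getInstance("AES");
    gen.init(128);
    return gen.generateKey();
  }

  static byte[] wrap(SecretKey master, SecretKey dataKey) throws GeneralSecurityException {
    Cipher c = Cipher.getInstance("AESWrap");
    c.init(Cipher.WRAP_MODE, master);
    return c.wrap(dataKey);
  }

  static SecretKey unwrap(SecretKey master, byte[] wrapped) throws GeneralSecurityException {
    Cipher c = Cipher.getInstance("AESWrap");
    c.init(Cipher.UNWRAP_MODE, master);
    return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
  }

  /** Tries the current master key first, then the alternate key if one is configured. */
  static SecretKey unwrapWithFallback(byte[] wrapped, SecretKey current, SecretKey alternate)
      throws GeneralSecurityException {
    try {
      return unwrap(current, wrapped);
    } catch (GeneralSecurityException e) {
      if (alternate == null) {
        throw e; // no fallback configured: the data key cannot be recovered
      }
      return unwrap(alternate, wrapped);
    }
  }

  /** Re-wraps a data key under the current master key, whichever key wrapped it before. */
  static byte[] rewrap(byte[] wrapped, SecretKey current, SecretKey alternate)
      throws GeneralSecurityException {
    return wrap(current, unwrapWithFallback(wrapped, current, alternate));
  }

  public static void main(String[] args) throws Exception {
    SecretKey oldMaster = newAesKey();
    SecretKey newMaster = newAesKey();
    byte[] wrappedOld = wrap(oldMaster, newAesKey());
    byte[] wrappedNew = rewrap(wrappedOld, newMaster, oldMaster);
    // After re-wrapping, the old master key is no longer needed.
    System.out.println(unwrap(newMaster, wrappedNew).getAlgorithm());
  }
}
```

Once every data key has been re-wrapped, the alternate key can be dropped, which mirrors removing the old alias and the fallback property after the major compactions finish.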
+
+
+
+
+
+
+
+
+
-
- Access Control Matrix
- The following matrix shows the minimum permission set required to perform operations in
- HBase. Before using the table, read through the information about how to interpret it.
-
- Interpreting the ACL Matrix Table
- The following conventions are used in the ACL Matrix table:
-
- Scopes
-
- Permissions are evaluated starting at the widest scope and working to the
- narrowest scope. A scope corresponds to a level of the data model. From broadest to
- narrowest, the scopes are as follows::
-
- Global
- Namespace (NS)
- Table
- Column Qualifier (CF)
- Column Family (CQ)
- Cell
-
- For instance, a permission granted at table level dominates any grants done at the
- ColumnFamily, ColumnQualifier, or cell level. The user can do what that grant implies
- at any location in the table. A permission granted at global scope dominates all: the
- user is always allowed to take that action everywhere.
-
-
-
- Permissions
-
- Possible permissions include the following:
-
- Superuser - a special user that belongs to group "supergroup" and has
- unlimited access
- Admin (A)
- Create (C)
- Write (W)
- Read (R)
- Execute (X)
-
-
-
-
- For the most part, permissions work in an expected way, with the following caveats:
+
+ Secure Bulk Load
+ Bulk loading in secure mode is a bit more involved than the normal setup, because the client
+ has to transfer ownership of the files generated by the MapReduce job to HBase. Secure
+ bulk loading is implemented by a coprocessor named SecureBulkLoadEndpoint, which uses a
+ staging directory set by the configuration property hbase.bulkload.staging.dir. The
+ default is /tmp/hbase-staging/.
+ Secure Bulk Load Algorithm
- Having Write permission does not imply Read permission. It is possible and sometimes
- desirable for a user to be able to write data that same user cannot read. One such example
- is a log-writing process.
-
-
- Admin is a superset of Create, so a user with Admin permissions does not also need
- Create permissions to perform an action such as creating a table.
-
-
- The hbase:meta table is readable by every user, regardless
- of the user's other grants or restrictions. This is a requirement for HBase to
- function correctly.
+ One time only, create a staging directory which is world-traversable and owned by
+ the user who runs HBase (mode 711, or rwx--x--x). A listing of this
+ directory will look similar to the following:
+ $ ls -ld /tmp/hbase-staging
+drwx--x--x 2 hbase hbase 68 3 Sep 14:54 /tmp/hbase-staging
+
- Users with Create or Admin permissions are granted Write permission on meta regions,
- so the table operations they are allowed to perform can complete, even if technically
- the bits can be granted separately in any possible combination.
+ A user writes out data to a secure output directory owned by that user. For example,
+ /user/foo/data.
- CheckAndPut and CheckAndDelete operations will fail if the user does not have both
- Write and Read permission.
+ Internally, HBase creates a secret staging directory which is globally
+ readable/writable (-rwxrwxrwx, 777). For example,
+ /tmp/hbase-staging/averylongandrandomdirectoryname. The name and
+ location of this directory are not exposed to the user. HBase manages creation and
+ deletion of this directory.
- Increment and Append operations do not require Read access.
+ The user makes the data world readable and writable, moves it into the random
+ staging directory, then calls the bulkLoadHFiles() method.
-
- The following table is sorted by the interface that provides each operation. In case the
- table goes out of date, the unit tests which check for accuracy of permissions can be found
- in
- hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java,
- and the access controls themselves can be examined in
- hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java.
-
-
-
-
-
- Server-side Configuration for Access Control
- Enable the AccessController coprocessor in the cluster configuration and restart HBase.
- The restart can be a rolling one. Complete the restart of all Master and RegionServer
- processes before setting up ACLs.
- To enable the AccessController, modify the hbase-site.xml file on every
- server machine in the cluster to look like:
+ Like delegation tokens, the strength of the security lies in the length and randomness
+ of the secret directory.
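To make the point about length and randomness concrete, the following minimal sketch generates an unguessable directory name under the staging directory using `SecureRandom`. It is only an illustration: how HBase actually names its secret staging directories is internal and is not shown in this document, and the class `StagingDirSketch` and the choice of 32 random bytes are assumptions.

```java
import java.security.SecureRandom;

/** Sketch of an unguessable staging path; not HBase's actual naming scheme. */
final class StagingDirSketch {
  private static final SecureRandom RNG = new SecureRandom();
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  /** Returns base + "/" + a hex name built from randomBytes cryptographically random bytes. */
  static String randomStagingPath(String base, int randomBytes) {
    byte[] bytes = new byte[randomBytes];
    RNG.nextBytes(bytes);
    StringBuilder sb = new StringBuilder(base);
    if (!base.endsWith("/")) {
      sb.append('/');
    }
    for (byte b : bytes) {
      sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // 32 random bytes give 256 bits of entropy in the directory name.
    System.out.println(randomStagingPath("/tmp/hbase-staging", 32));
  }
}
```

Because the parent directory is traversable but not listable (mode 711), another user can only reach the secret directory by guessing its full name.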
+ To enable secure bulk load, add the following properties to
+ hbase-site.xml.
- hbase.coprocessor.master.classes
- org.apache.hadoop.hbase.security.access.AccessController
+ hbase.bulkload.staging.dir
+ /tmp/hbase-staging
-hbase.coprocessor.region.classes
+ hbase.coprocessor.region.classes
+ org.apache.hadoop.hbase.security.token.TokenProvider,
- org.apache.hadoop.hbase.security.access.AccessController
+ org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint
]]>
+
+
-
- Cell level Access Control using Tags
- Prior to HBase 0.98 access control was restricted to table and column family level.
- Thanks to tags feature in 0.98 that allows Access control on a cell level. The existing
- Access Controller coprocessor helps in achieving cell level access control also. For details
- on configuring it refer to Access Control section.
- The ACLs can be specified for every mutation using the APIs
- perms)
- ]]>
- For example, to provide read permission to an user ‘user1’ then
-
- Generally the ACL applied on the table and CF takes precedence over Cell level ACL. In
- order to make the cell level ACL to take precedence use the following API,
-
- Please note that inorder to use this feature, HFile V3 version should be turned on.
+
+ Security Configuration Example
+ This configuration example includes support for HFile v3, ACLs, Visibility Labels, and
+ transparent encryption of data at rest and the WAL. All options have been discussed separately
+ in the sections above.
+
+ Example Security Settings in hbase-site.xml
+ hfile.format.version
+ 3
- ]]>
- Note that deletes with ACLs do not have any effect. To keep things simple the ACLs
- applied on the current Put does not change the ACL of any previous Put in the sense that the
- ACL on the current put does not affect older versions of Put for the same row.
-
-
- Shell Enhancements for Access Control
- The HBase shell has been extended to provide simple commands for editing and updating
- user permissions. The following commands have been added for access control list management:
-
-
-
- Grant
- [
[ [ ] ] ]
- ]]>
-
-
-
- <user|@group> is user or group (start with character '@'), Groups are
- created and manipulated via the Hadoop group mapping service.
-
- <permissions> is zero or more letters from the set "RWCA": READ('R'),
- WRITE('W'), CREATE('C'), ADMIN('A').
- Note: Grants and revocations of individual permissions on a resource are both
- accomplished using the grant command. A separate revoke command is
- also provided by the shell, but this is for fast revocation of all of a user's access rights
- to a given resource only.
-
- Revoke
-
- [
[ [ ] ] ]
- ]]>
-
-
- Alter
-
- The alter command has been extended to allow ownership
- assignment:
- 'username|@group'}
-]]>
-
-
-
- User Permission
-
- The user_permission command shows all access permissions for the current
- user for a given table:
-
- ]]>
-
-
-
-
-
-
-
- Secure Bulk Load
- Bulk loading in secure mode is a bit more involved than normal setup, since the client
- has to transfer the ownership of the files generated from the mapreduce job to HBase. Secure
- bulk loading is implemented by a coprocessor, named SecureBulkLoadEndpoint.
- SecureBulkLoadEndpoint uses a staging directory "hbase.bulkload.staging.dir",
- which defaults to /tmp/hbase-staging/. The algorithm is as follows.
-
-
- Create an hbase owned staging directory which is world traversable (-rwx--x--x,
- 711) /tmp/hbase-staging.
-
-
- A user writes out data to his secure output directory: /user/foo/data
-
-
- A call is made to hbase to create a secret staging directory which is globally
- readable/writable (-rwxrwxrwx, 777):
- /tmp/hbase-staging/averylongandrandomdirectoryname
-
-
- The user makes the data world readable and writable, then moves it into the random
- staging directory, then calls bulkLoadHFiles()
-
-
- Like delegation tokens the strength of the security lies in the length and randomness of
- the secret directory.
-
- You have to enable the secure bulk load to work properly. You can modify the
- hbase-site.xml file on every server machine in the cluster and add the
- SecureBulkLoadEndpoint class to the list of regionserver coprocessors:
-
- hbase.bulkload.staging.dir
- /tmp/hbase-staging
+ hbase.superuser
+ hbase, admin
+
hbase.coprocessor.region.classes
- org.apache.hadoop.hbase.security.token.TokenProvider,
- org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint
+ org.apache.hadoop.hbase.security.access.AccessController,
+ org.apache.hadoop.hbase.security.visibility.VisibilityController,
+ org.apache.hadoop.hbase.security.token.TokenProvider
- ]]>
-
-
-
-
- Visibility Labels
- This feature provides cell level security with labeled visibility for the cells. Cells
- can be associated with a visibility expression. The visibility expression can contain labels
- joined with logical expressions '&', '|' and '!'. Also using
- '(', ')' one can specify the precedence order. For example, consider the label
- set { confidential, secret, topsecret, probationary }, where the first three are sensitivity
- classifications and the last describes if an employee is probationary or not. If a cell is
- stored with this visibility expression: ( secret | topsecret ) & !probationary
- Then any user associated with the secret or topsecret label will be able to view the
- cell, as long as the user is not also associated with the probationary label. Furthermore, any
- user only associated with the confidential label, whether probationary or not, will not see
- the cell or even know of its existence.
- Visibility expressions like the above can be added when storing or mutating a cell using
- the API,
- Mutation#setCellVisibility(new CellVisibility(String labelExpession));
- Where the labelExpression could be '( secret | topsecret ) & !probationary'
- We build the user's label set in the RPC context when a request is first received by
- the HBase RegionServer. How users are associated with labels is pluggable. The default plugin
- passes through labels specified in Authorizations added to the Get or Scan and checks those
- against the calling user's authenticated labels list. When client passes some labels for
- which the user is not authenticated, this default algorithm will drop those. One can pass a
- subset of user authenticated labels via the Scan/Get authorizations.
- Get#setAuthorizations(new Authorizations(String,...));
- Scan#setAuthorizations(new Authorizations(String,...));
-
-
- Visibility Label Administration
- There are new client side Java APIs and shell commands for performing visibility labels
- administrative actions. Only the HBase super user is authorized to perform these operations.
-
-
- Adding Labels
- A set of labels can be added to the system either by using the Java API
- VisibilityClient#addLabels(Configuration conf, final String[]
- labels)
- Or by using the shell command
- add_labels [label1, label2]
- Valid label can include alphanumeric characters and characters '-',
- '_', ':', '.' and '/'
-
-
-
- User Label Association
- A set of labels can be associated with a user by using the API
- VisibilityClient#setAuths(Configuration conf, final String[] auths, final String
- user)
- Or by using the shell command
- set_auths user,[label1, label2].
- Labels can be disassociated from a user using API
- VisibilityClient#clearAuths(Configuration conf, final String[] auths, final
- String user)
- Or by using shell command
- clear_auths user,[label1, label2]
- One can use the API VisibilityClient#getAuths(Configuration conf, final String
- user) or get_auths shell command to get the list of labels
- associated for a given user. The labels and user auths information will be stored in the
- system table "labels".
-
-
-
-
- Server Side Configuration
- HBase stores cell level labels as cell tags. HFile version 3 adds the cell tags
- support. Be sure to use HFile version 3 by setting this property in every server site
- configuration file:
-
- hfile.format.version
- 3
-
- ]]>
- You will also need to make sure the VisibilityController coprocessor is active on every
- table to protect by adding it to the list of system coprocessors in the server site
- configuration files:
- hbase.coprocessor.master.classes
-org.apache.hadoop.hbase.security.visibility.VisibilityController
+ org.apache.hadoop.hbase.security.access.AccessController,
+ org.apache.hadoop.hbase.security.visibility.VisibilityController
- hbase.coprocessor.region.classes
-org.apache.hadoop.hbase.security.visibility.VisibilityController
-
- ]]>
- As said above, finding out labels authenticated for a given get/scan request is a
- pluggable algorithm. A custom implementation can be plugged in using the property
- hbase.regionserver.scan.visibility.label.generator.class. The default
- implementation class is
- org.apache.hadoop.hbase.security.visibility.DefaultScanLabelGenerator. One
- can configure a set of ScanLabelGenerators to be used by the system. For this, a comma
- separated set of implementation class names to be configured.
-
- Visibility Labels and Replication
- By default, visibility labels are lost on replication. To change this behavior, see
- .
-
-
-
-
-
- Transparent Server Side Encryption
- This feature provides transparent encryption for protecting HFile and WAL data at rest,
- using a two-tier key architecture for flexible and non-intrusive key rotation.
- First, the administrator provisions a cluster master key, stored into a key provider
- accessable to every trusted HBase process: the Master, the RegionServers, and clients (e.g.
- the shell) on administrative workstations. The default key provider integrates with the Java
- KeyStore API and any key management system with support for it. How HBase retrieves key
- material is configurable via the site file. The master key may be stored on the cluster
- servers, protected by a secure KeyStore file, or on an external keyserver, or in a hardware
- security module. This master key is resolved as needed by HBase processes through the
- configured key provider.
- Then, encryption keys can be specified in schema on a per column family basis, by
- creating or modifying a column descriptor to include two additional attributes: the name of
- the encryption algorithm to use (currently only "AES" is supported), and, optionally, a data
- key wrapped (encrypted) with the cluster master key. Per CF keys facilitates low impact
- incremental key rotation and reduces the scope of any external leak of key material. The
- wrapped data key is stored in the CF schema metadata, and in each HFile for the CF, encrypted
- with the cluster master key. Once the CF is configured for encryption, any new HFiles will be
- written encrypted. To insure encryption of all HFiles, trigger a major compaction after first
- enabling this feature. The key for decryption, encrypted with the cluster master key, is
- stored in the HFiles in a new meta block. At file open time the data key will be extracted
- from the HFile, decrypted with the cluster master key, and used for decryption of the
- remainder of the HFile. The HFile will be unreadable if the master key is not available.
- Should remote users somehow acquire access to the HFile data because of some lapse in HDFS
- permissions or from inappropriately discarded media, there will be no means to decrypt either
- the data key or the file data.
- Specifying a data key in the CF schema is optional. If one is not present, a random data
- key will be created for each HFile.
- A new configuration option for encrypting the WAL is also introduced. Even though WALs
- are transient, it is necessary to encrypt the WALEdits to avoid circumventing HFile
- protections for encrypted column families.
-
- Configuration
- Create a secret key of appropriate length for AES.
- \
- -genseckey -keyalg AES -keysize 128 \
- -alias
- ]]>
- where <password> is the password for the KeyStore file and <alias>is the
- user name of the HBase service account, typically "hbase". Simply press RETURN to store the
- key with the same password as the store. The resulting file should be distributed to all
- nodes running HBase daemons, with file ownership and permissions set to be readable only by
- the HBase service account.
- Configure HBase daemons to use a key provider backed by the KeyStore files for
- retrieving the cluster master key as needed.
- hbase.coprocessor.regionserver.classes
+ org.apache.hadoop.hbase.security.access.AccessController,
+ org.apache.hadoop.hbase.security.visibility.VisibilityController
+
+
+
+ hbase.security.exec.permission.checks
+ true
+
+
+
+ hbase.security.visibility.mutations.checkauth
+ false
+
+
+
+ hbase.rpc.protection
+ auth-conf
+
+
hbase.crypto.keyprovider
org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider
@@ -1698,27 +1791,11 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \
hbase.crypto.keyprovider.parameters
jceks:///path/to/hbase/conf/hbase.jks?password=
- ]]>
- By default the HBase service account name will be used to resolve the cluster master
- key, but you can store it with any arbitrary alias and configure HBase appropriately:
- hbase.crypto.master.key.name
- hbase
- ]]>
- Because the password to the key store is sensitive information, the HBase site XML file
- should also have its permissions set to be readable only by the HBase service account.
- Transparent encryption is a feature of HFile version 3. Be sure to use HFile version 3
- by setting this property in every server site configuration file:
-
- hfile.format.version
- 3
-
- ]]>
- Finally, configure the secure WAL in every server site configuration file:
- hbase.regionserver.hlog.reader.impl
- org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader
@@ -1731,53 +1808,64 @@ $ keytool -keystore /path/to/hbase/conf/hbase.jks \
hbase.regionserver.wal.encryption
true
+
+
+ hbase.crypto.master.alternate.key.name
+ hbase.old
+
+
+
+ hbase.bulkload.staging.dir
+ /tmp/hbase-staging
+
+
+ hbase.coprocessor.region.classes
+ org.apache.hadoop.hbase.security.token.TokenProvider,
+ org.apache.hadoop.hbase.security.access.AccessController,org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint
+
]]>
-
-
- Setting Encryption on a CF
- To enable encryption on a CF, use HBaseAdmin#modifyColumn or the HBase
- shell to modify the column descriptor. The attribute 'ENCRYPTION' specifies the encryption
- algorithm to use. Currently only "AES" is supported. If creating a new table, simply set
- this attribute; no subsequent table modification will be necessary.
- If setting a specific data key, the attribute 'ENCRYPTION_KEY' should contain the data
- key wrapped by the cluster master key. The static methods wrapKey and
- unwrapKey in org.apache.hadoop.hbase.security.EncryptionUtil can
- be used in conjunction with HColumnDescriptor#setEncryptionKey for this
- purpose. Because this must be done programmatically, setting a data key with the shell is not
- supported.
- To disable encryption on a CF, simply remove the 'ENCRYPTION' (and 'ENCRYPTION_KEY', if
- it was set) attributes from the column schema, using HBaseAdmin#modifyColumn or
- the HBase shell. All new HFiles for the CF will be written without encryption. Trigger a
- major compaction to rewrite all files.
-
-
- Data Key Rotation
- Data key rotation is made simple by this design. First, change the CF key in the column
- descriptor. Then, trigger major compaction. Once compaction has completed, all files will be
- (re)encrypted with the new key material. While this process is ongoing, HFiles encrypted
- with old key material will still be readable.
-
-
- Master Key Rotation
- Master key rotation can be achieved by updating the KeyStore to contain a new master
- key, as described above, and adding the old master key to the KeyStore under a
- different alias. Then, configure fallback to the old master key in the HBase site file:
+
+
+ Example Group Mapper in Hadoop core-site.xml
+ Adjust these settings to suit your environment.
- hbase.crypto.master.alternate.key.name
- hbase.old
+ hadoop.security.group.mapping
+ org.apache.hadoop.security.LdapGroupsMapping
+
+
+ hadoop.security.group.mapping.ldap.url
+ ldap://server
+
+
+ hadoop.security.group.mapping.ldap.bind.user
+ Administrator@example-ad.local
+
+
+ hadoop.security.group.mapping.ldap.bind.password
+ ****
+
+
+ hadoop.security.group.mapping.ldap.base
+ dc=example-ad,dc=local
+
+
+ hadoop.security.group.mapping.ldap.search.filter.user
+ (&(objectClass=user)(sAMAccountName={0}))
+
+
+ hadoop.security.group.mapping.ldap.search.filter.group
+ (objectClass=group)
+
+
+ hadoop.security.group.mapping.ldap.search.attr.member
+ member
+
+
+ hadoop.security.group.mapping.ldap.search.attr.group.name
+ cn
]]>
- This will require a rolling restart of the HBase daemons to take effect. As with data
- key rotation, trigger a major compaction and wait for it to complete. Once compaction has
- completed, all files will be (re)encrypted with data keys wrapped by the new cluster master
- key. The old master key, and its associated site file configuration, can then be removed,
- and all trace of the old master key will be gone after the next rolling restart. A second
- rolling restart is not immediately necessary.
-
+
-
diff --git src/main/site/resources/images/LDAPScanLabelGenerator.png src/main/site/resources/images/LDAPScanLabelGenerator.png
new file mode 100644
index 0000000..4fb67a5
Binary files /dev/null and src/main/site/resources/images/LDAPScanLabelGenerator.png differ