diff --git src/docbkx/security.xml src/docbkx/security.xml
index 066143a..ed4a0c2 100644
--- src/docbkx/security.xml
+++ src/docbkx/security.xml
@@ -495,4 +495,35 @@ The HBase shell has been extended to provide simple commands for editing and upd
+
+    <section>
+        <title>Secure Bulk Load</title>
+        <para>
+        Bulk loading in secure mode is a bit more involved than in a normal setup, since the client has to transfer the ownership of the files generated by the MapReduce job to HBase. Secure bulk loading is implemented by a coprocessor named SecureBulkLoadEndpoint, which uses a staging directory configured by <code>hbase.bulkload.staging.dir</code> (defaulting to <code>/tmp/hbase-staging/</code>). The algorithm is as follows.
+        <itemizedlist>
+            <listitem>Create an hbase-owned staging directory which is world-traversable (<code>-rwx--x--x</code>, 711): <code>/tmp/hbase-staging</code></listitem>
+            <listitem>A user writes out data to a secure output directory: <code>/user/foo/data</code></listitem>
+            <listitem>A call is made to hbase to create a secret staging directory
+            which is globally readable/writable (<code>-rwxrwxrwx</code>, 777): <code>/tmp/hbase-staging/averylongandrandomdirectoryname</code></listitem>
+            <listitem>The user makes the data world-readable and world-writable, moves it
+            into the random staging directory, and then calls <code>bulkLoadHFiles()</code></listitem>
+        </itemizedlist>
+        </para>
+        <para>
+        As with delegation tokens, the strength of the security lies in the length
+        and randomness of the secret directory.
+        </para>
+
+        <para>
+        You have to enable secure bulk load for this to work properly. To do so, modify the <code>hbase-site.xml</code> file on every server machine in the cluster and add the SecureBulkLoadEndpoint class to the list of regionserver coprocessors:
+        <programlisting><![CDATA[
+  <property>
+    <name>hbase.bulkload.staging.dir</name>
+    <value>/tmp/hbase-staging</value>
+  </property>
+  <property>
+    <name>hbase.coprocessor.region.classes</name>
+    <value>org.apache.hadoop.hbase.security.token.TokenProvider,
+    org.apache.hadoop.hbase.security.access.AccessController,
+    org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
+  </property>
+]]></programlisting>
+        </para>
+    </section>
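The staging-directory permission scheme described in the steps above can be sketched on a local filesystem. This is only an illustration of the algorithm, not the real implementation: secure bulk load runs against HDFS through SecureBulkLoadEndpoint, and the helper names and paths below are hypothetical.

```python
import os
import secrets
import shutil
import stat
import tempfile

def create_staging_root(base):
    """hbase-owned staging root, world-traversable (711)."""
    staging = os.path.join(base, "hbase-staging")
    os.mkdir(staging)
    os.chmod(staging, 0o711)  # set explicitly; mkdir's mode is masked by umask
    return staging

def create_secret_dir(staging):
    """Secret, hard-to-guess directory, globally readable/writable (777)."""
    name = secrets.token_hex(16)  # the randomness is what protects the data
    secret = os.path.join(staging, name)
    os.mkdir(secret)
    os.chmod(secret, 0o777)
    return secret

def move_into_staging(data_file, secret_dir):
    """The user makes the data world readable/writable, then moves it in."""
    os.chmod(data_file, 0o666)
    dest = os.path.join(secret_dir, os.path.basename(data_file))
    shutil.move(data_file, dest)
    return dest  # in real HBase, bulkLoadHFiles() would now be invoked

base = tempfile.mkdtemp()
staging = create_staging_root(base)
secret = create_secret_dir(staging)

data = os.path.join(base, "data.hfile")
with open(data, "w") as f:
    f.write("hfile contents")
dest = move_into_staging(data, secret)

assert stat.S_IMODE(os.stat(staging).st_mode) == 0o711
assert stat.S_IMODE(os.stat(secret).st_mode) == 0o777
print(os.path.exists(dest))  # True: data now sits in the secret staging dir
```

The key design point mirrored here is that the staging root is traversable but not listable (711), so an outsider cannot enumerate the random directory names, while the secret directory itself is wide open (777) so that the hbase user can take ownership of the moved files.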