Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
1.8.0, 3.0.0-alpha-2, 2.4.5
None
Reviewed
Allow any configuration for the remote cluster in HFileOutputFormat2. This is useful when a configuration different from the job's configuration is necessary to connect to the remote cluster, for instance, non-secure vs. secure.
Description
We introduced support for generating HFiles with good locality for a remote cluster in HBASE-25608.
I realized we need to override other configurations for the remote cluster in addition to the ZooKeeper cluster key.
For example, consider reading from a non-secure cluster and writing HFiles for a secure cluster.
In this case, we use TableInputFormat for the non-secure cluster with hbase.security.authentication=simple in the job configuration.
HFileOutputFormat then fails to connect to the remote secure cluster, because that connection requires hbase.security.authentication=kerberos in the job configuration.
Thus, let's introduce a configuration override for the remote-cluster-aware, locality-sensitive feature of HFileOutputFormat.
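To make the idea of a configuration override concrete, here is a minimal sketch that uses plain Java maps rather than the real Hadoop/HBase classes. The prefix `example.remote.cluster.`, the class name, and the method name are all hypothetical and chosen only for illustration; this issue description does not state which keys or API HBase actually uses. The sketch assumes that prefixed keys in the job configuration replace the corresponding unprefixed keys when the configuration for the remote cluster is built.

```java
import java.util.HashMap;
import java.util.Map;

public class RemoteClusterConfOverride {
    // Hypothetical prefix for this sketch; not taken from HBase itself.
    static final String REMOTE_PREFIX = "example.remote.cluster.";

    /**
     * Builds the configuration used to connect to the remote cluster.
     * Starts from the unprefixed job settings, then applies every key
     * found under REMOTE_PREFIX with the prefix stripped.
     */
    static Map<String, String> remoteClusterConf(Map<String, String> jobConf) {
        Map<String, String> merged = new HashMap<>();
        // First pass: copy the ordinary job settings.
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            if (!e.getKey().startsWith(REMOTE_PREFIX)) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        // Second pass: prefixed keys override the unprefixed ones.
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            if (e.getKey().startsWith(REMOTE_PREFIX)) {
                merged.put(e.getKey().substring(REMOTE_PREFIX.length()), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        // Setting used by TableInputFormat against the non-secure source cluster.
        jobConf.put("hbase.security.authentication", "simple");
        // Override that applies only when connecting to the secure remote cluster.
        jobConf.put(REMOTE_PREFIX + "hbase.security.authentication", "kerberos");

        Map<String, String> remote = remoteClusterConf(jobConf);
        System.out.println(remote.get("hbase.security.authentication"));
    }
}
```

With this shape, the job configuration keeps hbase.security.authentication=simple for reading, while the configuration derived for the remote cluster carries kerberos.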
Another example is reading from a secure cluster (A) and writing HFiles for another secure cluster (B), where we use a different principal for each cluster.
For instance, we use cluster-a/_HOST@EXAMPLE.COM for A and cluster-b/_HOST@EXAMPLE.COM for B.
Then we need to override MASTER_KRB_PRINCIPAL and REGIONSERVER_KRB_PRINCIPAL with cluster-b/_HOST@EXAMPLE.COM to connect to cluster B.
^ This is not true: we use token-based digest auth in the mapper/reducer, so the difference in Kerberos principals should be fine.