Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
We ran into an interesting bug when test teams were running HBase against cloud storage without ensuring that the previous location was cleaned. This resulted in an hbase.rootdir that had:
- A valid HBase MasterData Region
- A valid hbase:meta
- A valid collection of HBase tables
- An empty ZooKeeper
The changes we worked on previously, described in HBASE-24286, were effective in getting everything except the Procedures back online without issue. Parsing the existing procedures produced an interesting error:
java.lang.IllegalArgumentException: Illegal principal name hbase/wrong-hostname.domain@WRONG_REALM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to hbase/wrong-hostname.domain@WRONG_REALM
    at org.apache.hadoop.security.User.<init>(User.java:51)
    at org.apache.hadoop.security.User.<init>(User.java:43)
    at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1418)
    at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1402)
    at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.toUserInfo(MasterProcedureUtil.java:60)
    at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.deserializeStateData(ModifyTableProcedure.java:262)
    at org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:294)
    at org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
    at org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:411)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:78)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.load(ProcedureExecutor.java:339)
    at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:285)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:330)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:600)
    at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1581)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:835)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2205)
    at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:514)
    at java.lang.Thread.run(Thread.java:750)
What's actually happening is that we are storing the User into the procedure and then relying on UserGroupInformation to parse the User protobuf into a UGI to get the "short" username.
When the serialized procedure gets loaded (whether from the MasterData region or via PV2 WAL files, I think), we end up needing Hadoop's auth_to_local configuration to map that Kerberos principal back to a name. However, Hadoop's KerberosName will only unwrap Kerberos principals which match the local Kerberos realm (defined by default_realm in krb5.conf).
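To make the failure mode concrete, here is a rough, self-contained model of how a default-realm-only mapping behaves. It is a simplified sketch and not Hadoop's actual KerberosName code: the class name DefaultRuleSketch, the shortName helper, and the EXAMPLE.COM realm are all made up for illustration, and real auth_to_local rules are more expressive than this.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified model of a "default realm only" principal mapping.
// This is NOT Hadoop's KerberosName implementation.
final class DefaultRuleSketch {
    // Matches "primary[/instance]@REALM".
    private static final Pattern PRINCIPAL =
        Pattern.compile("([^/@]+)(/([^/@]+))?@([^/@]+)");

    static String shortName(String principal, String defaultRealm) {
        Matcher m = PRINCIPAL.matcher(principal);
        if (!m.matches()) {
            // No realm present: treat the input as an already-short name.
            return principal;
        }
        String primary = m.group(1);
        String realm = m.group(4);
        if (realm.equals(defaultRealm)) {
            // Local realm: the short name is the first component.
            return primary;
        }
        // Foreign realm and no other rule configured: nothing applies.
        throw new IllegalArgumentException("No rules applied to " + principal);
    }

    public static void main(String[] args) {
        // Principal from the local realm maps to its first component.
        System.out.println(shortName("hbase/host.example.com@EXAMPLE.COM", "EXAMPLE.COM"));
        try {
            // Principal stored by the previous cluster, from a different realm.
            shortName("hbase/wrong-hostname.domain@WRONG_REALM", "EXAMPLE.COM");
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model, a principal that was valid for the cluster that wrote the procedure becomes unparseable as soon as the store is read by a cluster in another realm, which matches the shape of the error above.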
The interesting part is that we don't seem to ever use the user other than to display the owner attribute for procedures on the HBase UI. There is a method in hbase-procedure which can filter procedures based on Owner, but I didn't see any usages of that method.
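If the owner really is display-only, one conceivable way to tolerate this would be to fall back to the raw principal string when it cannot be mapped, rather than aborting the load. The following is only a sketch of that idea, not how HBase behaves today and not a proposed patch; LenientOwnerSketch, ownerForDisplay, and strictShortName are hypothetical names, and strictShortName merely stands in for a strict resolver that rejects foreign realms.

```java
import java.util.function.Function;

// Hypothetical sketch: resolve a procedure owner for display without
// failing on principals that cannot be mapped to a short name.
final class LenientOwnerSketch {
    // Stand-in for a strict resolver that only accepts the EXAMPLE.COM realm.
    static String strictShortName(String principal) {
        if (principal.endsWith("@EXAMPLE.COM")) {
            // First component before '/' or '@'.
            return principal.split("[/@]")[0];
        }
        throw new IllegalArgumentException("No rules applied to " + principal);
    }

    static String ownerForDisplay(String principal, Function<String, String> resolver) {
        try {
            return resolver.apply(principal);
        } catch (IllegalArgumentException e) {
            // Unmappable principal: keep the raw string so loading can continue.
            return principal;
        }
    }

    public static void main(String[] args) {
        Function<String, String> strict = LenientOwnerSketch::strictShortName;
        System.out.println(ownerForDisplay("hbase/host.example.com@EXAMPLE.COM", strict));
        System.out.println(ownerForDisplay("hbase/wrong-hostname.domain@WRONG_REALM", strict));
    }
}
```

The trade-off is that the owner shown for such procedures would be a full principal instead of a short name, which only matters if something other than the UI depends on that value.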
Given the pushback against HBASE-24286, I assume that, for the same reasons, we would see pushback against fixing this issue. However, I wanted to call it out for posterity. The expectation of users is that HBase should implicitly handle this case.
Issue Links
- is related to: HBASE-24286 HMaster won't become healthy after after cloning or creating a new cluster pointing at the same file system (Resolved)