Details
- Bug
- Status: Resolved
- Major
- Resolution: Won't Fix
- 2.6.0
- None
- None
- Kerberized, HA cluster, iNotify client, CDH5.7.0
Description
When a NameNode serves iNotify requests from a client, it verifies the client has superuser permission and then uses the client's Kerberos principal to read edits from journal nodes.
However, if the client does not renew its TGT, the connection from the NameNode to the journal nodes may fail. In that case, the NameNode concludes that the edits are corrupt and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit log streams are shorter than the current one! The best remaining edit log ends at transaction 11577603, but we thought we could read up to transaction 11577606. If you continue, metadata will be lost forever!"
However, the edits are actually fine. The NameNode should not freak out when an iNotify client's TGT expires.
I think an easy solution to this bug is for the NameNode, after it verifies that the client has superuser permission, to call SecurityUtil.doAsLoginUser and then read the edits. This ensures the operation does not fail because of an expired client ticket.
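The proposed flow can be sketched roughly as follows. This is only an illustration of the control flow, not Hadoop code: `Identity`, `readEdits`, `getEditsAsCaller`, `getEditsAsLoginUser`, and `checkSuperuser` are hypothetical stand-ins, and ticket expiry is modeled as a simple timestamp comparison. In the real NameNode, the read step would presumably be wrapped in SecurityUtil.doAsLoginUser, as suggested above.

```java
import java.io.IOException;
import java.time.Instant;

final class LoginUserReadSketch {

    /** Hypothetical stand-in for a Kerberos-authenticated identity. */
    static final class Identity {
        final String name;
        final boolean superuser;
        final Instant ticketExpiry;

        Identity(String name, boolean superuser, Instant ticketExpiry) {
            this.name = name;
            this.superuser = superuser;
            this.ticketExpiry = ticketExpiry;
        }

        boolean hasValidTicket(Instant now) {
            return now.isBefore(ticketExpiry);
        }
    }

    /** Stand-in for fetching edits from the journal nodes; requires a valid ticket. */
    static String readEdits(Identity reader, Instant now, long fromTxId) throws IOException {
        if (!reader.hasValidTicket(now)) {
            throw new IOException("Authentication to journal node failed for " + reader.name);
        }
        return "edits from txid " + fromTxId + " read as " + reader.name;
    }

    /** Authorization check that is always performed against the calling client. */
    static void checkSuperuser(Identity caller) throws IOException {
        if (!caller.superuser) {
            throw new IOException("Access denied for " + caller.name + ": superuser privilege is required");
        }
    }

    /** Behavior described in the report: authorize the caller, then read as the caller. */
    static String getEditsAsCaller(Identity caller, Instant now, long fromTxId) throws IOException {
        checkSuperuser(caller);
        return readEdits(caller, now, fromTxId);
    }

    /** Proposed behavior: authorize the caller, then read as the NameNode's own login user. */
    static String getEditsAsLoginUser(Identity caller, Identity loginUser, Instant now, long fromTxId)
            throws IOException {
        checkSuperuser(caller);
        // Assumption: in Hadoop this read would run inside SecurityUtil.doAsLoginUser(...).
        return readEdits(loginUser, now, fromTxId);
    }

    public static void main(String[] args) throws IOException {
        Instant now = Instant.parse("2016-08-18T19:05:13Z");
        Identity client = new Identity("client", true, now.minusSeconds(60));   // ticket already expired
        Identity nameNode = new Identity("hdfs", true, now.plusSeconds(3600));  // ticket still valid

        System.out.println(getEditsAsLoginUser(client, nameNode, now, 11577487L));
        try {
            getEditsAsCaller(client, now, 11577487L);
        } catch (IOException e) {
            System.out.println("Read as caller failed: " + e.getMessage());
        }
    }
}
```

The point of the split is that authorization still depends on the client, while the journal read depends only on the NameNode's own credentials, so an expired client ticket can no longer be mistaken for a corrupt edit log.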
Excerpt of related logs:
2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs@EXAMPLE.COM (auth:KERBEROS) cause:java.io.IOException: We encountered an error reading http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy, http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy. During automatic edit log failover, we noticed that all of the remaining edit log streams are shorter than the current one! The best remaining edit log ends at transaction 11577603, but we thought we could read up to transaction 11577606. If you continue, metadata will be lost forever!
2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 112 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client IP:port] Call#73 Retry#0
java.io.IOException: We encountered an error reading http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy, http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy. During automatic edit log failover, we noticed that all of the remaining edit log streams are shorter than the current one! The best remaining edit log ends at transaction 11577603, but we thought we could read up to transaction 11577606. If you continue, metadata will be lost forever!
    at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
    at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
Attachments
Issue Links
- is related to: HDFS-13040 Kerberized inotify client fails despite kinit properly (Resolved)
- relates to: HDFS-10643 Namenode should use loginUser(hdfs) to generateEncryptedKey (Resolved)