Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: 3.3.0
Description
We observed that when a NameNode becomes UNAVAILABLE, the corresponding block pool ID in MembershipStoreImpl#activeNamespaces on dfsrouter is unintentionally reset to empty, its initial value.
As a result, concat operations through dfsrouter fail with the following error, because the router cannot resolve the block pool ID among the recognized active namespaces.
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): Cannot locate a nameservice for block pool BP-...
A possible fix is to ignore UNAVAILABLE NameNode registrations when constructing the cache of active namespaces, and to populate the namespace information only from available NameNode registrations.
https://github.com/apache/hadoop/blob/rel/release-3.3.0/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/impl/MembershipStoreImpl.java#L207-L221
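The proposed fix can be illustrated with a minimal, self-contained sketch. It is not the actual MembershipStoreImpl code or the committed patch: the `Registration` class, the `State` enum, and `buildActiveNamespaces` are hypothetical simplifications, and the cache is modeled as a plain map from nameservice ID to block pool ID. The sketch assumes that a registration for an UNAVAILABLE NameNode carries an empty block pool ID, as described above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ActiveNamespaceSketch {

  // Hypothetical stand-in for the NameNode states seen by the router.
  enum State { ACTIVE, STANDBY, UNAVAILABLE }

  // Hypothetical, simplified view of one NameNode registration.
  static final class Registration {
    final String nameserviceId;
    final String blockPoolId;
    final State state;

    Registration(String nameserviceId, String blockPoolId, State state) {
      this.nameserviceId = nameserviceId;
      this.blockPoolId = blockPoolId;
      this.state = state;
    }
  }

  // Builds the nameservice ID -> block pool ID cache, skipping registrations
  // of UNAVAILABLE NameNodes so that their empty block pool ID can never
  // overwrite the value reported by an available NameNode.
  static Map<String, String> buildActiveNamespaces(
      List<Registration> registrations) {
    Map<String, String> active = new LinkedHashMap<>();
    for (Registration r : registrations) {
      if (r.state == State.UNAVAILABLE) {
        continue; // assumption: this registration has no usable namespace info
      }
      active.put(r.nameserviceId, r.blockPoolId);
    }
    return active;
  }

  public static void main(String[] args) {
    // Illustrative values only; the block pool IDs are made up.
    List<Registration> regs = Arrays.asList(
        new Registration("ns0", "BP-example-0", State.ACTIVE),
        new Registration("ns0", "", State.UNAVAILABLE),
        new Registration("ns1", "BP-example-1", State.STANDBY));
    System.out.println(buildActiveNamespaces(regs));
  }
}
```

Without the `UNAVAILABLE` check, the second `ns0` registration in this example would replace the valid block pool ID with an empty string, which mirrors the failure described in this issue.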
Attachments
Issue Links
- causes: HDFS-15952 TestRouterRpcMultiDestination#testProxyGetTransactionID and testProxyVersionRequest are flaky (Resolved)