Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
Similar to "hadoop.security.group.mapping.ldap.directory.search.timeout", we need a configurable timeout for the group lookup call in the other group mapping service providers, such as org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback and org.apache.hadoop.security.ShellBasedUnixGroupsMapping.
Currently, a delayed group lookup holds locks for a long time and crashes the NameNode. The proposal is to time out the call and report to the user that the operation failed because the group lookup was delayed.
2023-03-01 18:49:25,367 WARN org.apache.hadoop.security.Groups: Potential performance problem: getGroups(user=XXXXXXXXXX) took 232236 milliseconds.
2023-03-01 18:49:25,368 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of suppressed read-lock reports: 21
Longest read-lock held at 1970-01-11 13:29:34,218+0100 for 232236ms via java.lang.Thread.getStackTrace(Thread.java:1564)
In addition to reporting the longest lock hold, we could also consider printing a message only when all handlers are waiting on the current lock, since that situation might cause a failover or crash due to the HA timeout.
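To make the requested timeout concrete, here is a minimal sketch of one way a group lookup could be bounded. It is not Hadoop code: TimedGroupLookup, the GroupLookup interface, and the timeoutMillis parameter are hypothetical names used only for illustration, and running the lookup on a separate thread is an assumed approach, not the project's chosen design.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical wrapper that bounds how long a group lookup may take. */
public class TimedGroupLookup {

    /** Hypothetical stand-in for a group mapping provider's lookup call. */
    public interface GroupLookup {
        List<String> getGroups(String user) throws IOException;
    }

    private final GroupLookup delegate;
    private final long timeoutMillis;

    // Daemon threads so a stuck lookup cannot keep the JVM alive.
    private final ExecutorService executor = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "group-lookup");
        t.setDaemon(true);
        return t;
    });

    public TimedGroupLookup(GroupLookup delegate, long timeoutMillis) {
        this.delegate = delegate;
        this.timeoutMillis = timeoutMillis;
    }

    /** Returns the user's groups, or fails with an IOException once the timeout expires. */
    public List<String> getGroups(String user) throws IOException {
        Callable<List<String>> task = () -> delegate.getGroups(user);
        Future<List<String>> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker thread; a lookup blocked
            // in native code may keep running in the background.
            future.cancel(true);
            throw new IOException("Group lookup for user " + user
                + " timed out after " + timeoutMillis + " ms", e);
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException("Group lookup for user " + user + " failed", cause);
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while looking up groups for user " + user, e);
        }
    }
}
```

With a wrapper along these lines, the caller gets a failure after the configured timeout and can release its lock instead of waiting for the full duration of the lookup. The limitation noted in the comment matters for a JNI-based provider, where interrupting the worker thread may not stop the underlying call.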