Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
trunk
-
None
-
None
Description
OOZIE-1906 added a znode cleanup thread.
It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie server to keep reaping a znode even after that znode has been cleaned up. (https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/Reaper.java)
This adds memory pressure on the Oozie server. The mode needs to change to REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
reaper = new ChildReaper(zk.getClient(), LOCKS_NODE, Reaper.Mode.REAP_INDEFINITELY, getExecutorService(), ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000, REAPING_LEADER_PATH);
We hit a scenario in which one ZK quorum slowed down for a short period, so many ZK locks were not released properly. The ChildReaper, which runs every 5 minutes, ran right afterwards, picked up those znodes, and has kept checking that list ever since. In the end, the Oozie server hit an OOM.
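To illustrate why the mode matters, here is a minimal, self-contained sketch of a toy reaper. It is not Curator's implementation: the enum only mirrors the names of the Reaper.Mode values, the in-memory set stands in for ZooKeeper, every node is treated as empty and deletable, and ToyReaper, trackedAfter and the /locks/lock-N paths are hypothetical names made up for this example. It models only the one property described above, namely whether a path stays tracked after its znode is gone.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

class ReaperModeSketch {
    // Same names as Curator's Reaper.Mode values; the behavior below is a simplified model.
    enum Mode { REAP_INDEFINITELY, REAP_UNTIL_DELETE, REAP_UNTIL_GONE }

    /** Hypothetical toy reaper: tracks paths and tries to delete them on every cycle. */
    static final class ToyReaper {
        private final Mode mode;
        private final Set<String> existingNodes;            // stands in for ZooKeeper
        private final Set<String> tracked = new HashSet<>(); // paths the reaper keeps checking

        ToyReaper(Mode mode, Set<String> existingNodes) {
            this.mode = mode;
            this.existingNodes = existingNodes;
        }

        void addPath(String path) {
            tracked.add(path);
        }

        void runCycle() {
            // Iterate over a copy so paths can be dropped from the tracked set while looping.
            for (String path : new ArrayList<>(tracked)) {
                // Assumption: every node is empty, so a delete succeeds whenever the node exists.
                boolean deleted = existingNodes.remove(path);
                boolean gone = !existingNodes.contains(path);
                boolean stop =
                        (mode == Mode.REAP_UNTIL_DELETE && deleted)
                        || (mode == Mode.REAP_UNTIL_GONE && gone);
                if (stop) {
                    tracked.remove(path);
                }
                // REAP_INDEFINITELY never sets stop, so the path stays tracked.
            }
        }

        int trackedCount() {
            return tracked.size();
        }
    }

    /** Creates the given number of lock nodes, runs some cycles, and returns how many paths are still tracked. */
    static int trackedAfter(Mode mode, int nodes, int cycles) {
        Set<String> zk = new HashSet<>();
        ToyReaper reaper = new ToyReaper(mode, zk);
        for (int i = 0; i < nodes; i++) {
            String path = "/locks/lock-" + i; // hypothetical path layout
            zk.add(path);
            reaper.addPath(path);
        }
        for (int c = 0; c < cycles; c++) {
            reaper.runCycle();
        }
        return reaper.trackedCount();
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> still tracked: " + trackedAfter(mode, 1000, 3));
        }
    }
}
```

In this model, the tracked set never shrinks under REAP_INDEFINITELY even though every node was deleted on the first cycle, while the other two modes release each path once it is deleted or gone. A burst of unreleased locks therefore leaves a permanent footprint only in the first mode.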