Details
Improvement
Status: Open
Minor
Resolution: Unresolved
3.1.1
None
None
Description
A very common scenario: customer jobs fail when writing into a directory "x" because the number of items in "x" has reached the limit configured by dfs.namenode.fs-limits.max-directory-items.
Example:
The directory item limit of /tmp is exceeded: limit=1048576 items=1048576
I think we should expose a new metric in "NameNodeMetrics" that lists the paths exceeding 90% of dfs.namenode.fs-limits.max-directory-items. The drawback is cost: the path size would have to be recomputed, and the path removed from the metric, on every delete.
So, should we consider letting the SNN handle this from updateCountForQuota? updateCountForQuota already runs often on the SNN, so CM can query the SNN and alert users when this path list is non-empty.
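To make the proposal concrete, here is a minimal sketch of the bookkeeping such a metric would need. It is not existing HDFS code: the class NearLimitDirTracker and its methods record and snapshot are hypothetical names, and the assumption is that a periodic pass (such as the one suggested for updateCountForQuota) calls record once per directory with its current child count.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

/**
 * Hypothetical helper, not part of HDFS: remembers directories whose
 * child count has reached a given percentage of the directory item limit.
 */
final class NearLimitDirTracker {
    private final long maxDirItems;   // value of dfs.namenode.fs-limits.max-directory-items
    private final long percent;       // alert threshold, e.g. 90
    private final Map<String, Integer> nearLimit = new ConcurrentSkipListMap<>();

    NearLimitDirTracker(int maxDirItems, int percent) {
        if (maxDirItems <= 0 || percent <= 0 || percent > 100) {
            throw new IllegalArgumentException("invalid limit or percent");
        }
        this.maxDirItems = maxDirItems;
        this.percent = percent;
    }

    /** Record the current child count of a directory seen during a scan. */
    void record(String path, int childCount) {
        // Integer arithmetic avoids rounding: count/limit >= percent/100.
        if ((long) childCount * 100 >= maxDirItems * percent) {
            nearLimit.put(path, childCount);
        } else {
            // The directory dropped back under the threshold since the last scan.
            nearLimit.remove(path);
        }
    }

    /** Sorted copy of the current path list, suitable for exposing as a metric. */
    Map<String, Integer> snapshot() {
        return new TreeMap<>(nearLimit);
    }
}
```

Because entries are only added or removed when a scan visits the directory, nothing has to be recomputed on the delete path; the trade-off is that the list can be stale until the next scan.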
FSDirectory#verifyMaxDirItems.
/**
 * Verify children size for fs limit.
 *
 * @throws MaxDirectoryItemsExceededException too many children.
 */
void verifyMaxDirItems(INodeDirectory parent, String parentPath)
    throws MaxDirectoryItemsExceededException {
  final int count = parent.getChildrenList(CURRENT_STATE_ID).size();
  if (count >= maxDirItems) {
    final MaxDirectoryItemsExceededException e
        = new MaxDirectoryItemsExceededException(parentPath, maxDirItems,
            count);
    if (namesystem.isImageLoaded()) {
      throw e;
    } else {
      // Do not throw if edits log is still being processed
      NameNode.LOG.error("FSDirectory.verifyMaxDirItems: "
          + e.getLocalizedMessage());
    }
  }
}