[YARN-8273] Log aggregation does not warn if HDFS quota in target directory is exceeded

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.0
Fix Version/s: 3.2.0
Component/s: log-aggregation
Labels: None
Target Version/s: 3.2.0
Hadoop Flags: Reviewed

Description

It appears that if an HDFS space quota is set on the target directory for log aggregation and that quota is already exceeded when log aggregation is attempted, zero-byte log files are written to the HDFS directory. The NodeManager logs, however, do not reflect the failure to write the files: there are no ERROR or WARN messages to this effect.

It may be worth investigating an improvement that alerts users to this scenario. Otherwise, the logs for a YARN application may be missing both on HDFS and locally (once local log cleanup has run), and the user may never be informed.
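The scenario and the kind of alert suggested above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: it is not the attached patch and does not use any Hadoop API. The names `QuotaExceededError`, `QuotaLimitedDir`, and `aggregate` are hypothetical, and the quota check is deliberately simplified compared with how HDFS actually accounts for space.

```python
import logging

log = logging.getLogger("log_aggregation_sketch")


class QuotaExceededError(Exception):
    """Hypothetical stand-in for an HDFS space-quota failure."""


class QuotaLimitedDir:
    """Toy model of a remote directory with a space quota (in bytes)."""

    def __init__(self, quota_bytes, used_bytes=0):
        self.quota_bytes = quota_bytes
        self.used_bytes = used_bytes
        self.files = {}  # file name -> bytes actually stored

    def write(self, name, data):
        # The file entry is created before any data is stored, so a failed
        # write leaves a zero-byte file behind, as described in the report.
        self.files[name] = b""
        if self.used_bytes + len(data) > self.quota_bytes:
            raise QuotaExceededError(f"quota exceeded while writing {name}")
        self.files[name] = data
        self.used_bytes += len(data)


def aggregate(local_logs, remote_dir):
    """Upload each local log; delete the local copy only after success.

    Returns the names of the logs whose upload failed.
    """
    failed = []
    for name in list(local_logs):
        try:
            remote_dir.write(name, local_logs[name])
        except QuotaExceededError as exc:
            # Surface the failure instead of silently dropping it.
            log.warning("Log aggregation failed for %s: %s", name, exc)
            failed.append(name)
            continue
        del local_logs[name]
    return failed
```

In this model a failed upload produces a WARN message and leaves the local copy in place, so the log is not lost on both sides at once.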

Steps to reproduce:

1. Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 2MB).
2. Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full.
3. Run a Spark or MR job in the cluster.
4. Observe that zero-byte files are written to HDFS after job completion.
5. Observe that YARN container logs are also not present on the NM hosts (or are deleted after yarn.nodemanager.delete.debug-delay-sec).
6. Observe that no ERROR or WARN messages appear to be logged in the NM role log.
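When reproducing the steps above, it can help to confirm how much space quota is left on the target directory before running the job. The helper below is a minimal sketch that parses one data line of `hdfs dfs -count -q <path>` output, whose columns are QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME. The function names are hypothetical, and it assumes the command was run without `-h`, so values are raw byte counts.

```python
COLUMNS = (
    "quota",
    "remaining_quota",
    "space_quota",
    "remaining_space_quota",
    "dir_count",
    "file_count",
    "content_size",
    "pathname",
)


def parse_count_q(line):
    """Parse one data line of `hdfs dfs -count -q <path>` output."""
    # Split into at most 8 fields so a path containing spaces stays intact.
    fields = line.split(None, 7)
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
    info = {}
    for name, value in zip(COLUMNS, fields):
        if name == "pathname":
            info[name] = value.strip()
        elif value in ("none", "inf"):
            # "none" / "inf" are printed when no quota is set.
            info[name] = None
        else:
            info[name] = int(value)
    return info


def space_quota_too_small(info, needed_bytes):
    """Return True if a space quota is set and less than needed_bytes remains."""
    remaining = info["remaining_space_quota"]
    return remaining is not None and remaining < needed_bytes
```

For example, with a made-up output line for the directory from the steps (the numbers are illustrative, not taken from a real cluster):

```python
sample = "none inf 2097152 1024 1 3 2096128 /tmp/logs/username/logs"
info = parse_count_q(sample)
print(space_quota_too_small(info, needed_bytes=4096))
```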

Attachments

YARN-8273.000.patch, 10/May/18 16:21, 19 kB, Gergo Repas
YARN-8273.001.patch, 15/May/18 10:07, 17 kB, Gergo Repas
YARN-8273.002.patch, 15/May/18 12:31, 18 kB, Gergo Repas
YARN-8273.003.patch, 17/May/18 15:54, 22 kB, Gergo Repas
YARN-8273.004.patch, 18/May/18 10:20, 24 kB, Gergo Repas
YARN-8273.005.patch, 22/May/18 08:24, 24 kB, Gergo Repas
YARN-8273.006.patch, 22/May/18 12:29, 24 kB, Gergo Repas

Issue Links

breaks

YARN-8492 ATSv2 HBase tests are failing with ClassNotFoundException (Resolved)
YARN-10648 NM local logs are not cleared after uploading to hdfs (Patch Available)

People

Assignee: Gergo Repas
Reporter: Gergo Repas
Votes: 0
Watchers: 8

Dates

Created: 10/May/18 16:19
Updated: 23/Feb/21 09:54
Resolved: 22/May/18 21:24