Description
We have seen exceptions like the following NumberFormatException when using 'yarn logs' to read aggregated log files:
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:518)
at org.apache.hadoop.yarn.logaggregation.LogDumper.dumpAContainerLogs(LogDumper.java:178)
at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:130)
at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:246)
We traced it down to the reader attempting to read the file type of the next file while its position is still inside the log data of the previous file. What happened is that the Log Length was written as a certain size, but the log data was actually longer than that, so the reader ends up parsing raw log bytes where it expects the next record's fields.
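For reference, the framing the reader expects looks roughly like this; a minimal sketch of the header parsing in readAContainerLogsForALogType, where the class and method shape are illustrative rather than the actual Hadoop code:

import java.io.DataInputStream;
import java.io.IOException;

final class LogRecordHeaderSketch {
  // Each per-file record starts with two UTF strings: the file type
  // and the log length (a decimal string). If the previous record's
  // data overran its recorded length, the stream is mispositioned
  // here and parseLong() sees raw log bytes, which produces the
  // NumberFormatException in the stack trace above.
  static long readHeader(DataInputStream valueStream) throws IOException {
    String fileType = valueStream.readUTF();      // e.g. "stderr"
    String fileLengthStr = valueStream.readUTF(); // decimal length string
    return Long.parseLong(fileLengthStr);
  }
}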
Inside the write() routine in LogValue, the logfile length is written first, but the log data is then copied all the way to the end of the file. There is a race condition here: if someone is still writing to the file when it gets aggregated, the length that was recorded can be too small for the data that is actually copied.
We should have the write() routine stop once it has written as many bytes as it recorded for the length. It would be nice if we could somehow tell the user the log might be truncated, but I'm not sure of a good way to do this.
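A minimal sketch of what a bounded copy could look like, assuming a simplified write() shape; the class name, buffer size, and overall structure are illustrative, not the actual LogValue code:

import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

final class BoundedLogWriteSketch {
  static void writeOneLog(DataOutputStream out, File logFile) throws IOException {
    long fileLength = logFile.length();       // sampled once, up front
    out.writeUTF(logFile.getName());          // file type
    out.writeUTF(Long.toString(fileLength));  // recorded log length
    byte[] buf = new byte[64 * 1024];
    long remaining = fileLength;
    FileInputStream in = new FileInputStream(logFile);
    try {
      while (remaining > 0) {
        int toRead = (int) Math.min(buf.length, remaining);
        int len = in.read(buf, 0, toRead);
        if (len == -1) {
          break;  // file shrank underneath us; nothing more to copy
        }
        out.write(buf, 0, len);
        remaining -= len;
      }
    } finally {
      in.close();
    }
    // Bytes appended to logFile after length() was sampled are
    // deliberately dropped, so the data never overruns the recorded
    // length and the reader stays aligned on the next record.
  }
}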
We also noticed a bug in readAContainerLogsForALogType where it uses an int for curRead when it should be using a long:
while (len != -1 && curRead < fileLength) {
This isn't actually a problem right now, as the underlying decoder appears to do the right thing and the len condition exits the loop.
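For completeness, a sketch of the int-to-long fix that keeps the loop condition quoted above; the surrounding method and the clamped reads are illustrative, not the exact Hadoop code:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class LongCurReadSketch {
  static void copyRecord(InputStream in, OutputStream out, long fileLength)
      throws IOException {
    byte[] buf = new byte[64 * 1024];
    long curRead = 0;  // was: int curRead = 0; overflows past 2GB of log data
    int toRead = (int) Math.min(buf.length, fileLength);
    int len = in.read(buf, 0, toRead);
    while (len != -1 && curRead < fileLength) {
      out.write(buf, 0, len);
      curRead += len;
      toRead = (int) Math.min(buf.length, fileLength - curRead);
      len = toRead > 0 ? in.read(buf, 0, toRead) : -1;
    }
  }
}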