[HDDS-7097] Container scanner log output lacks useful information - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.4.0
Component/s: None
Labels:
- pull-request-available

Description

Currently, the output from the container scanner may look like this:

2022-08-04 14:16:37,702 WARN org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer: Moving container /hadoop-ozone/datanode/data/hdds/CID-5612c780-06f8-4ac5-9eae-498159abd009/current/containerDir1/1008 to state UNHEALTHY from state:UNHEALTHY Trace:java.base/java.lang.Thread.getStackTrace(Thread.java:1606)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.markContainerUnhealthy(KeyValueContainer.java:335)
org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.markContainerUnhealthy(KeyValueHandler.java:1017)
org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.markContainerUnhealthy(ContainerController.java:116)
org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner.runIteration(ContainerDataScanner.java:108)
org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner.run(ContainerDataScanner.java:81)
...
2022-08-04 14:30:19,407 ERROR org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerCheck: Corruption detected in container: [2] Exception: [null]

There are several problems with this:

- The previous container state is not logged. The new UNHEALTHY state is incorrectly logged as the previous state.
- Only the message of the exception identifying the corruption is printed. The exception object itself should be logged to better identify the failure and to catch cases like the one above, where there is no exception message (probably caused by a bug).
- The stack trace of the call to KeyValueContainer#markContainerUnhealthy is logged, which is both verbose and not useful.
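
The problems above can be illustrated with a minimal Java sketch. This is not the change made in the linked pull request: `ScannerLogSketch`, its `Container` and `State` types, and the method names are hypothetical stand-ins. It uses `java.util.logging` only to stay self-contained, whereas the log output above comes from the project's own logging setup. The sketch captures the previous state before it is overwritten, logs no call-site stack trace on the state change, and passes the exception object itself to the logger.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class ScannerLogSketch {
  // Hypothetical stand-in for the container lifecycle states.
  enum State { OPEN, CLOSED, UNHEALTHY }

  private static final Logger LOG =
      Logger.getLogger(ScannerLogSketch.class.getName());

  // Hypothetical stand-in for a container; not the real KeyValueContainer.
  static final class Container {
    private final long id;
    private State state;

    Container(long id, State state) {
      this.id = id;
      this.state = state;
    }

    State getState() {
      return state;
    }

    // Marks the container unhealthy and returns the state it had before.
    State markUnhealthy() {
      State previous = state;      // capture BEFORE overwriting the field
      state = State.UNHEALTHY;
      // One concise line; no stack trace of the caller is appended.
      LOG.log(Level.WARNING, transitionMessage(id, previous, state));
      return previous;
    }
  }

  // Builds the state-change message with both states in the right order.
  static String transitionMessage(long id, State previous, State next) {
    return "Moving container " + id + " to state " + next
        + " from state " + previous;
  }

  // Logs the exception object, so its type and stack trace are available
  // even when getMessage() returns null.
  static void logCorruption(long containerId, Throwable cause) {
    LOG.log(Level.SEVERE,
        "Corruption detected in container " + containerId, cause);
  }

  public static void main(String[] args) {
    Container container = new Container(1008L, State.CLOSED);
    container.markUnhealthy();
    // An exception with no message still produces a useful log entry.
    logCorruption(2L, new NullPointerException());
  }
}
```

With a logging facade such as SLF4J, the equivalent of `logCorruption` would pass the exception as the last argument to the logging call instead of formatting `getMessage()` into the message string.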

Issue Links

relates to

HDDS-7413 Fix logging while marking container state unhealthy

Resolved

links to

GitHub Pull Request #4169

People

Assignee: Dave Teng

Reporter: Ethan Rose

Votes: 0

Watchers: 2

Dates

Created: 04/Aug/22 23:21

Updated: 14/Dec/23 17:55

Resolved: 23/Jan/23 17:58