Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Labels: ghx-label-7
Description
Some data errors (for example out-of-range Parquet timestamps) can dominate the logs if a table contains a large number of rows with invalid data. If an error has its own error code (see common/thrift/generate_error_codes.py), these errors are already aggregated for the user (RuntimeState::LogError()) on a per-query basis, but the logs still contain a new line for every occurrence. This is rarely useful, as the log lines repeat the same information (the corrupt data itself is not logged, since it can be sensitive).
The best approach would be to reduce logging without losing information:
- the first occurrence of an error should be logged (per query/fragment/table/file/column) to help investigate cases where a data error leads to other errors, and to avoid breaking log analyzer tools that search for the current format
- other occurrences can be aggregated into a single line, e.g. "in query Q, table T, column C, error XY occurred N times"
An extra goal is to avoid calling RuntimeState::LogError() for occurrences other than the first, as RuntimeState::LogError() takes a (per-fragment) lock.
Attachments
Issue Links
- is related to IMPALA-5845 Impala should de-duplicate row parsing error (Resolved)