Details
New Feature
Status: Open
Major
Resolution: Unresolved
4.0.0
None
Description
This proposal introduces structured logging in Apache Spark. The default log file format will change from plain text to JSON, which makes the logs easier to parse and analyze. Each record will carry identifiers such as worker, executor, query, job, stage, and task IDs, so the logs become more informative and easier to search.
Current Logging Format
The current format of Spark logs is plain text, which can be challenging to parse and analyze efficiently. An example of the current log format is as follows:
23/11/29 17:53:44 ERROR BlockManagerMasterEndpoint: Fail to know the executor 289 is alive or not.
org.apache.spark.SparkException: Exception thrown in awaitResult:
<stacktrace…>
Caused by: org.apache.spark.rpc.RpcEndpointNotFoundException: ..
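To illustrate why this format is awkward to analyze, the sketch below extracts fields from the first line of the example with a regular expression. The line layout encoded in the pattern and the helper name `parse_plain_line` are assumptions made for illustration and are not part of Spark. Note that the executor ID has to be recovered from the free-form message with a second, message-specific pattern.

```python
import re

# Assumed layout of a plain-text log line:
# "<yy/MM/dd HH:mm:ss> <LEVEL> <Source>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<source>[^:]+): (?P<msg>.*)$"
)
# Message-specific pattern; each message wording would need its own.
EXECUTOR_RE = re.compile(r"executor (\d+)")


def parse_plain_line(line):
    """Hypothetical helper: return a dict of fields, or None for
    continuation lines such as stack trace entries."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    executor = EXECUTOR_RE.search(record["msg"])
    record["executor_id"] = executor.group(1) if executor else None
    return record


line = ("23/11/29 17:53:44 ERROR BlockManagerMasterEndpoint: "
        "Fail to know the executor 289 is alive or not.")
parsed = parse_plain_line(line)
print(parsed)
```

Lines belonging to the stack trace do not match the header pattern, so a real parser would also need logic to attach them to the preceding record.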
Proposed Structured Logging Format
The proposed change involves structuring the logs in JSON format, which organizes the log information into easily identifiable fields. Here is how the new structured log format would look:
{
  "ts": "23/11/29 17:53:44",
  "level": "ERROR",
  "msg": "Fail to know the executor 289 is alive or not",
  "context": {
    "executor_id": "289"
  },
  "exception": {
    "class": "org.apache.spark.SparkException",
    "msg": "Exception thrown in awaitResult",
    "stackTrace": "..."
  },
  "source": "BlockManagerMasterEndpoint"
}
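With a record in this shape, the same fields are available through ordinary JSON parsing. The following is a minimal sketch using only Python's standard library; the record is the example above, with the elided stack trace left as "...".

```python
import json

# The example record from the proposal, serialized on one line.
raw = (
    '{"ts":"23/11/29 17:53:44","level":"ERROR",'
    '"msg":"Fail to know the executor 289 is alive or not",'
    '"context":{"executor_id":"289"},'
    '"exception":{"class":"org.apache.spark.SparkException",'
    '"msg":"Exception thrown in awaitResult","stackTrace":"..."},'
    '"source":"BlockManagerMasterEndpoint"}'
)

record = json.loads(raw)

# Identifiers live under "context"; no message-specific regex is needed.
executor_id = record["context"]["executor_id"]
# "exception" may be absent on records without an error, so read it defensively.
exception_class = record.get("exception", {}).get("class")

print(record["level"], record["source"], executor_id, exception_class)
```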
This format will enable users to upload and directly query driver/executor/master/worker log files using Spark SQL for more effective problem-solving and analysis, such as tracking executor losses or identifying faulty tasks:
spark.read.json("hdfs://hdfs_host/logs").createOrReplaceTempView("logs")

/* To get all the executor lost logs */
SELECT * FROM logs WHERE contains(msg, 'Lost executor');

/* To get all the distributed logs about executor 289 */
SELECT * FROM logs WHERE context.executor_id = '289';

/* To get all the errors on host 100.116.29.4 (assumes a top-level host field, which the sample record above does not show) */
SELECT * FROM logs WHERE host = '100.116.29.4' AND level = 'ERROR';
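For small log files, the same three filters can be tried locally without a cluster. The sketch below is one possible way to do that with Python's standard library. The three log lines are invented for illustration and are not real Spark output, the field names follow the sample JSON record (`msg`, `level`, `context.executor_id`), and the top-level `host` field is an assumption because the sample record does not include it.

```python
import json

# Hypothetical JSON-lines content, invented for illustration only.
LOG_LINES = [
    '{"ts":"23/11/29 17:53:40","level":"WARN","msg":"Lost executor 289",'
    '"context":{"executor_id":"289"},"host":"100.116.29.4","source":"SchedulerExample"}',
    '{"ts":"23/11/29 17:53:44","level":"ERROR","msg":"Fail to know the executor 289 is alive or not",'
    '"context":{"executor_id":"289"},"host":"100.116.29.4","source":"BlockManagerMasterEndpoint"}',
    '{"ts":"23/11/29 17:54:01","level":"INFO","msg":"Example message",'
    '"context":{"executor_id":"12"},"host":"100.116.29.7","source":"SchedulerExample"}',
]

records = [json.loads(line) for line in LOG_LINES]


def executor_of(record):
    """Return the executor ID from the nested context, or None if absent."""
    return record.get("context", {}).get("executor_id")


# All "executor lost" records.
lost = [r for r in records if "Lost executor" in r["msg"]]
# All records about executor 289.
about_289 = [r for r in records if executor_of(r) == "289"]
# All errors on host 100.116.29.4 ("host" is an assumed field).
errors_on_host = [
    r for r in records
    if r.get("host") == "100.116.29.4" and r["level"] == "ERROR"
]

print(len(lost), len(about_289), len(errors_on_host))
```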
SPIP doc: https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing