[SPARK-47240] SPIP: Structured Logging Framework for Apache Spark

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Spark Core

    Description

      This proposal aims to enhance Apache Spark's logging system by introducing structured logging. The change switches the default log file format from plain text to JSON and enriches each log entry with key identifiers such as worker, executor, query, job, stage, and task IDs, making the logs easier to search, query, and analyze.

      Current Logging Format

      The current format of Spark logs is plain text, which can be challenging to parse and analyze efficiently. An example of the current log format is as follows:

      23/11/29 17:53:44 ERROR BlockManagerMasterEndpoint: Fail to know the executor 289 is alive or not.
      org.apache.spark.SparkException: Exception thrown in awaitResult: 
          <stacktrace…>
      Caused by: org.apache.spark.rpc.RpcEndpointNotFoundException: ..
      

      Proposed Structured Logging Format

      The proposed change involves structuring the logs in JSON format, which organizes the log information into easily identifiable fields. Here is how the new structured log format would look:

      {
         "ts":"23/11/29 17:53:44",
         "level":"ERROR",
         "msg":"Fail to know the executor 289 is alive or not",
         "context":{
             "executor_id":"289"
         },
         "exception":{
            "class":"org.apache.spark.SparkException",
            "msg":"Exception thrown in awaitResult",
            "stackTrace":"..."
         },
         "source":"BlockManagerMasterEndpoint"
      } 
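
      To illustrate (this is a hedged sketch, not the proposed Spark API), context identifiers such as executor_id could be attached to a log call through SLF4J's MDC so that a JSON layout emits them under the "context" field. The object name, the withContext helper, and the key names below are hypothetical:

      import org.slf4j.{Logger, LoggerFactory, MDC}

      object StructuredLoggingSketch {
        private val log: Logger = LoggerFactory.getLogger("BlockManagerMasterEndpoint")

        // Hypothetical helper: keep the given context entries in the MDC for the
        // duration of `body`, so a JSON layout can emit them as structured fields.
        def withContext[T](context: Map[String, String])(body: => T): T = {
          context.foreach { case (k, v) => MDC.put(k, v) }
          try body finally context.keys.foreach(MDC.remove)
        }

        def reportExecutorCheckFailure(executorId: String, e: Throwable): Unit = {
          withContext(Map("executor_id" -> executorId)) {
            // With a JSON layout configured, this call is emitted as a single JSON
            // object carrying the level, message, MDC context, and the exception.
            log.error(s"Fail to know the executor $executorId is alive or not.", e)
          }
        }
      }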

      This format will enable users to upload and directly query driver/executor/master/worker log files using Spark SQL for more effective problem-solving and analysis, such as tracking executor losses or identifying faulty tasks:

      spark.read.json("hdfs://hdfs_host/logs").createOrReplaceTempView("logs")
      /* To get all the executor lost logs */
      SELECT * FROM logs WHERE contains(msg, 'Lost executor');
      /* To get all the distributed logs about executor 289 */
      SELECT * FROM logs WHERE context.executor_id = '289';
      /* To get all the errors on host 100.116.29.4 */
      SELECT * FROM logs WHERE host = '100.116.29.4' AND level = 'ERROR';
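
      As an illustrative follow-up (assuming the example schema above, including the nested context.executor_id field), the same kind of analysis can also be expressed with the DataFrame API, for example counting ERROR entries per executor:

      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.functions._

      val spark = SparkSession.builder().appName("log-analysis").getOrCreate()

      // Read the JSON logs; Spark infers the nested schema shown above.
      val logs = spark.read.json("hdfs://hdfs_host/logs")

      // Count ERROR entries per executor, most affected executors first.
      logs.filter(col("level") === "ERROR")
        .groupBy(col("context.executor_id").alias("executor_id"))
        .count()
        .orderBy(desc("count"))
        .show()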
      

       

      SPIP doc: https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing

          People

            • Assignee: Unassigned
            • Reporter: Gengliang Wang
            • Votes: 2
            • Watchers: 7
