Spark / SPARK-6066

Metadata in event log makes it very difficult for external libraries to parse event log

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.3.0
    • Component/s: Spark Core
    • Labels: None

    Description

      The fix for SPARK-2261 added a line at the beginning of the event log that encodes metadata. This line makes it much more difficult to parse the event logs from external libraries (like https://github.com/kayousterhout/trace-analysis, which is used by folks at Berkeley) because:

      (1) The metadata is not written as JSON, unlike the rest of the file.
      (2) More annoyingly, if the file is compressed, the metadata line is not. This has two side effects: first, you can't just decompress the file from the command line and then look at the logs, because the file is in this weird half-compressed format; and second, external tools that parse these logs now also need to handle that format.

      We should fix this before the 1.3 release, because otherwise we'll have to add a bunch more backward-compatibility code to handle this weird format!
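
To make the extra work concrete, here is a minimal sketch of what an external parser ends up doing. Everything in it is an assumption for illustration: the metadata text, the sample event, the function names, and the use of gzip as a stand-in for whatever codec Spark is configured with. It is not the actual on-disk format, only the shape of the problem: a plain-text first line followed by a compressed body of JSON lines.

```python
import gzip
import json


def build_sample_log(metadata_line, events):
    """Build bytes that mimic the layout described above (hypothetical data).

    The first line is plain text; the remainder is the compressed JSON events.
    gzip is only a stand-in for the real compression codec.
    """
    body = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    return metadata_line.encode("utf-8") + b"\n" + gzip.compress(body)


def parse_half_compressed_log(raw):
    """Special-case parser: peel off the uncompressed first line, then
    decompress and parse the rest as one JSON object per line."""
    header, _, compressed_body = raw.partition(b"\n")
    body = gzip.decompress(compressed_body).decode("utf-8")
    events = [json.loads(line) for line in body.splitlines() if line.strip()]
    return header.decode("utf-8"), events


def parse_uniform_log(raw):
    """What a parser could do if the whole file were compressed JSON lines."""
    body = gzip.decompress(raw).decode("utf-8")
    return [json.loads(line) for line in body.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = build_sample_log(
        "hypothetical-metadata-line",  # made-up header text, not Spark's
        [{"Event": "SparkListenerApplicationStart"}],
    )
    header, events = parse_half_compressed_log(sample)
    print(header, events)
    try:
        parse_uniform_log(sample)  # same as decompressing the whole file
    except OSError as err:
        print("whole-file decompression fails:", err)
```

In this sketch, decompressing the whole file fails because the file does not start with compressed data, which is the command-line problem from (2). The special-case parser is the extra code every external tool would have to carry.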

          People

            Assignee: andrewor14 Andrew Or
            Reporter: kayousterhout Kay Ousterhout
