[SPARK-18535] Redact sensitive information from Spark logs and UI - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.1.0
Fix Version/s: 2.1.2, 2.2.0
Component/s: Spark Core, Web UI, YARN
Labels: None

Description

A Spark user may have to provide sensitive information in a Spark configuration property, or source an environment variable containing sensitive information into the executor or driver environment. A good example is reading from or writing to S3 using Spark. The S3 secret and S3 access key can be placed in a Hadoop credential provider. However, one still needs to provide the password for the credential provider to Spark, and that password is typically supplied as an environment variable in the driver and executor environments. This environment variable shows up in logs, and may also show up in the UI.

1. In logs, it shows up in a few places:
1A. Event logs, under the SparkListenerEnvironmentUpdate event.
1B. YARN logs, when printing the executor launch context.
2. In the UI, it shows up in the Environment tab, but it is redacted if it contains the word "password" or "secret". These magic words are hardcoded and hence not customizable.

This JIRA tracks the work to make sure sensitive information is redacted from all logs and UIs in Spark, while still being passed on to every place that needs it.
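The goal above can be illustrated with a minimal sketch. This is not Spark's implementation; the function name `redact`, the constant names, the placeholder text, and the example variable names are all hypothetical. It assumes redaction is decided by matching a configurable regular expression against each key, with a default that mirrors the two hardcoded words mentioned in the description. The redacted copy is what would be logged or displayed, while the original pairs stay intact for the code that actually needs them.

```python
import re

# Hypothetical default, mirroring the hardcoded words "password" and "secret".
DEFAULT_REDACTION_PATTERN = r"(?i)secret|password"

# Hypothetical placeholder shown in place of a sensitive value.
REDACTED_PLACEHOLDER = "*********(redacted)"


def redact(pairs, pattern=DEFAULT_REDACTION_PATTERN):
    """Return a copy of (key, value) pairs with sensitive values masked.

    A value is masked when the pattern matches anywhere in its key.
    The input list is not modified, so the real values can still be
    passed on to the driver and executors.
    """
    regex = re.compile(pattern)
    return [
        (key, REDACTED_PLACEHOLDER if regex.search(key) else value)
        for key, value in pairs
    ]


# Illustrative environment; the names and values are made up.
env = [
    ("HADOOP_CREDSTORE_PASSWORD", "hunter2"),
    ("SPARK_HOME", "/opt/spark"),
]

for key, value in redact(env):
    print(f"{key}={value}")
```

Making the pattern a parameter is what addresses the "not customizable" complaint: a caller could pass, say, `pattern=r"(?i)token"` to cover keys that the two default words would miss.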

Attachments

redacted.png (36 kB), attached by Mark Grover, 22/Nov/16 00:30

Issue Links

is related to

SPARK-19720 Redact sensitive information from SparkSubmit console output (Resolved)

links to

[Github] Pull Request #15971 (markgrover)

[Github] Pull Request #18802 (dmvieira)

People

Assignee: Mark Grover
Reporter: Mark Grover
Votes: 0
Watchers: 4

Dates

Created: 22/Nov/16 00:11
Updated: 12/Dec/22 18:10
Resolved: 28/Nov/16 17:00