Spark / SPARK-19720

Redact sensitive information from SparkSubmit console output

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: Spark Submit
    • Labels:
      None

      Description

      SPARK-18535 redacted sensitive information from Spark event logs and the UI. However, it intentionally left SparkSubmit's console output unredacted, because that output appears on the client machine, which already has the sensitive information on disk (in spark-defaults.conf) or on the terminal (on the spark-submit command line).

      It now seems better to redact SparkSubmit's console output as well, because orchestration software such as Oozie usually exposes that output through a UI. To make matters worse, Oozie in particular always sets the --verbose flag when invoking SparkSubmit, which makes the sensitive information readily available in its UI (see code here).

      This is a JIRA for tracking redaction of sensitive information from SparkSubmit's console output.
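
      As an illustration of the kind of redaction being tracked, here is a minimal sketch in Python. It is not Spark's implementation. The key pattern, the placeholder text, the helper names (`redact`, `format_verbose`), and the sample property values are all assumptions made for this example. The idea is to mask the value of any property whose key matches a sensitive-looking pattern before the verbose output is printed.

```python
import re

# Assumed pattern for sensitive keys; illustrative only, not read from Spark.
SENSITIVE_KEY_PATTERN = re.compile(r"(?i)secret|password")
# Hypothetical placeholder that replaces a masked value.
REDACTED = "*********(redacted)"


def redact(properties, pattern=SENSITIVE_KEY_PATTERN):
    """Return (key, value) pairs, masking the value when the key matches."""
    return [
        (key, REDACTED if pattern.search(key) else value)
        for key, value in properties
    ]


def format_verbose(properties):
    """Render properties as a verbose-style dump, after redaction."""
    return "\n".join(f"({key},{value})" for key, value in redact(properties))


# Made-up sample properties for the demonstration.
props = [
    ("spark.app.name", "demo"),
    ("spark.hadoop.fs.s3a.secret.key", "hunter2"),
    ("spark.ssl.keyStorePassword", "changeit"),
]
print(format_verbose(props))
```

      Matching on the key rather than the value keeps the check cheap and predictable, at the cost of missing secrets stored under keys that do not look sensitive.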

              People

              • Assignee:
                mgrover Mark Grover
              • Reporter:
                mgrover Mark Grover
              • Votes:
                0
              • Watchers:
                5
