Spark / SPARK-28626

Spark leaves unencrypted data on local disk, even with encryption turned on (CVE-2019-10099)


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.2
    • Fix Version/s: 2.3.3, 2.4.0
    • Component/s: Security
    • Labels: None

      Description

      Severity: Important

       

      Vendor: The Apache Software Foundation

       

      Versions affected:

      All Spark 1.x, Spark 2.0.x, Spark 2.1.x, and 2.2.x versions

      Spark 2.3.0 to 2.3.2

       

      Description:

      Prior to Spark 2.3.3, Spark would in certain situations write user data to local disk unencrypted, even when spark.io.encryption.enabled=true. The affected paths are: cached blocks that are fetched to disk (controlled by spark.maxRemoteBlockSizeFetchToMem); parallelize in SparkR; broadcast and parallelize in PySpark; and Python UDFs.
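      To judge whether a deployment relies on the setting named above, one option is to inspect its configuration text. The following is a minimal sketch, not part of the fix: the two property names come from this advisory, while the helper names (parse_spark_conf, encryption_enabled) are hypothetical, and the parser assumes simple whitespace- or '='-separated lines as commonly seen in spark-defaults.conf.

```python
import re

# Property names taken from the advisory text.
ENCRYPTION_KEY = "spark.io.encryption.enabled"
FETCH_TO_MEM_KEY = "spark.maxRemoteBlockSizeFetchToMem"


def parse_spark_conf(text):
    """Parse 'key value' or 'key=value' lines, skipping blanks and # comments.

    Hypothetical helper: handles only the simple line forms described above.
    """
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Key is everything up to the first whitespace or '='; the rest is the value.
        match = re.match(r"([^\s=]+)\s*[=\s]\s*(.*)", line)
        if match:
            conf[match.group(1)] = match.group(2).strip()
    return conf


def encryption_enabled(conf):
    """True if local-disk I/O encryption is switched on (it is off by default)."""
    return conf.get(ENCRYPTION_KEY, "false").lower() == "true"
```

      If encryption_enabled returns True on an affected version, the code paths listed in the description may still have written plaintext to local disk; the value stored under FETCH_TO_MEM_KEY only controls when remote blocks are fetched to disk rather than memory.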

       

       

      Mitigation:

      Users of Spark 1.x, 2.0.x, 2.1.x, 2.2.x, and 2.3.0 to 2.3.2 should upgrade to 2.3.3 or newer (the 2.4.x releases also include the fix).
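      The version rule above can be expressed as a small check. This is a sketch under stated assumptions: the function names are hypothetical, suffixes such as "-SNAPSHOT" are ignored, and vendor distributions that backport fixes under an older version number are not accounted for.

```python
import re

# First fixed release according to this advisory.
FIXED_IN = (2, 3, 3)


def parse_version(version):
    """Return (major, minor, patch) from strings like '2.3.2' or '2.4.0-SNAPSHOT'.

    A missing patch component is treated as 0; any suffix is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError("unrecognized Spark version: %r" % version)
    return int(match.group(1)), int(match.group(2)), int(match.group(3) or 0)


def is_affected(version):
    """True if the version predates 2.3.3 and so falls in the affected range."""
    return parse_version(version) < FIXED_IN
```

      For example, is_affected("2.3.2") is True, while is_affected("2.3.3") and is_affected("2.4.0") are both False.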

       

      Credit:

      This issue was reported by Thomas Graves of NVIDIA.

       

      References:

      https://spark.apache.org/security.html

       

      The following commits fixed this issue in branch-2.3 (equivalent commits may exist in master / branch-2.4):

      commit 575fea120e25249716e3f680396580c5f9e26b5b
      Author: Imran Rashid <irashid@cloudera.com>
      Date:   Wed Aug 22 16:38:28 2018 -0500
      
          [CORE] Updates to remote cache reads
      
          Covered by tests in DistributedSuite
      
       
      commit 6d742d1bd71aa3803dce91a830b37284cb18cf70
      Author: Imran Rashid <irashid@cloudera.com>
      Date:   Thu Sep 6 12:11:47 2018 -0500
      
          [PYSPARK][SQL] Updates to RowQueue
      
          Tested with updates to RowQueueSuite
      
       
      commit 09dd34cb1706f2477a89174d6a1a0f17ed5b0a65
      Author: Imran Rashid <irashid@cloudera.com>
      Date:   Mon Aug 13 21:35:34 2018 -0500 
      
          [PYSPARK] Updates to pyspark broadcast
      
       
      commit 12717ba0edfa5459c9ac2085f46b1ecc0ee759aa
      Author: hyukjinkwon <gurwls223@apache.org>
      Date:   Mon Sep 24 19:25:02 2018 +0800 
      
          [SPARKR] Match pyspark features in SparkR communication protocol
      

            People

            • Assignee: Unassigned
            • Reporter: Imran Rashid (irashid)