  1. Spark
  2. SPARK-33227

Add Jar with Azure SAS token fails with URL encoded characters

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.3
    • Fix Version/s: None
    • Component/s: Spark Submit
    • Labels:
      None

      Description

      I am running spark-submit with an Azure SAS token URL to access the jar file. When the sig parameter of the SAS token contains URL-encoded characters before its end, the jar download fails with a 403 error. The failure appears to be related to the change in URL encoding that occurs within DependencyUtils: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala#L137.

      Error message:

      + exec /usr/local/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.0.0.44 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class MyClass 'https://storageaccount.blob.core.windows.net/blob/my-jar.jar?sv=2019-12-12&ss=b&srt=sco&sp=r&se=****&st=******&spr=https&sig=sigwith%2Band%2Fending%3D'

      java.io.IOException: Server returned HTTP response code: 403 for URL: https://storageaccount.blob.core.windows.net/blob/ivm-0.2.40-Spark-2.2.jar?sv=2019-12-12&ss=b&srt=sco&sp=r&se=**********&st=*********&spr=https&sig=sigwith+and/ending=
          at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1900)
          at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
          at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
          at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:713)
          at org.apache.spark.deploy.DependencyUtils$.downloadFile(DependencyUtils.scala:137)
          at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
          at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
          at scala.Option.map(Option.scala:146)
          at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:366)
          at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
          at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
          at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
          at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
          at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

      It may not be clear from the example above, but the SAS token URL I submit contains:

      sig=sigwith%2Band%2Fending%3D

      The URL reported in the 403 error from the stack trace contains:

      sig=sigwith+and/ending=

      Is there something I can do to ensure that these characters do not get URL decoded in this way?
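
      The difference between the two sig values above can be reproduced with java.net.URI alone. The following is a minimal sketch, not Spark's actual code: it assumes the URL is parsed into a URI and then rebuilt from its decoded components, which is one way such a change can happen. The class and method names (SasQueryDemo, rawQuery, rebuildFromDecodedQuery) are hypothetical and exist only for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of Spark.
class SasQueryDemo {

    // Returns the query exactly as written, with percent-escapes preserved.
    static String rawQuery(String url) {
        return URI.create(url).getRawQuery();
    }

    // Rebuilds the URI from its *decoded* components. getQuery() decodes
    // %2B, %2F and %3D, and the multi-argument constructor does not
    // re-encode '+', '/' or '=' because they are legal query characters.
    static String rebuildFromDecodedQuery(String url) {
        URI uri = URI.create(url);
        try {
            return new URI(uri.getScheme(), uri.getAuthority(), uri.getPath(),
                    uri.getQuery(), uri.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String url = "https://storageaccount.blob.core.windows.net/blob/my-jar.jar"
                + "?sp=r&sig=sigwith%2Band%2Fending%3D";
        System.out.println(rawQuery(url));                // escapes kept
        System.out.println(rebuildFromDecodedQuery(url)); // escapes lost
    }
}
```

      If the rebuilt form is what reaches the storage service, the signature no longer matches the one that was issued, which would be consistent with the 403 response.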

            People

            • Assignee: Unassigned
            • Reporter: James McShane (mcshane)
            • Votes: 0
            • Watchers: 2
