Spark / SPARK-21339

spark-shell --packages option does not add jars to classpath on windows


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.1
    • Fix Version/s: 2.2.1, 2.3.0
    • Component/s: Spark Shell, Windows
    • Labels:
      None
    • Environment:

      Windows 10 Enterprise x64

    • Flags:
      Important

      Description

I am unable to import symbols from dependencies specified with the --packages option:

      spark-shell --packages "com.datastax.spark:spark-cassandra-connector_2.11:2.0.2" --conf spark.jars.ivy="c:/tmp/ivy2" --verbose
      

      This results in:

      scala> import com.datastax.spark.connector._
      <console>:23: error: object datastax is not a member of package com
             import com.datastax.spark.connector._
                        ^
      

NOTE: The same command works as expected on Linux but fails on Windows.
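A quick way to confirm from the driver JVM whether the connector jar actually made it onto the classpath is to probe the context classloader for one of its classes. This is a diagnostic sketch, not part of the report; the class name is taken from the failing import above.

```java
// Diagnostic sketch: returns true only when the named class is visible to the
// current thread's context classloader (i.e. the jar is on the classpath).
public class ClasspathProbe {
    static boolean classVisible(String fqcn) {
        try {
            // initialize=false: we only care about visibility, not class init.
            Class.forName(fqcn, false, Thread.currentThread().getContextClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Package object class behind the failing `import com.datastax.spark.connector._`.
        System.out.println(classVisible("com.datastax.spark.connector.package$"));
    }
}
```

On the affected Windows setup this would report the class as not visible when launched via `--packages`, consistent with the REPL error below.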

      Complete verbose output:

> spark-shell --packages "com.datastax.spark:spark-cassandra-connector_2.11:2.0.2" --conf spark.jars.ivy="c:/tmp/ivy2" --verbose
      Using properties file: null
      Parsed arguments:
        master                  local[*]
        deployMode              null
        executorMemory          null
        executorCores           null
        totalExecutorCores      null
        propertiesFile          null
        driverMemory            null
        driverCores             null
        driverExtraClassPath    null
        driverExtraLibraryPath  null
        driverExtraJavaOptions  null
        supervise               false
        queue                   null
        numExecutors            null
        files                   null
        pyFiles                 null
        archives                null
        mainClass               org.apache.spark.repl.Main
        primaryResource         spark-shell
        name                    Spark shell
        childArgs               []
        jars                    null
        packages                com.datastax.spark:spark-cassandra-connector_2.11:2.0.2
        packagesExclusions      null
        repositories            null
        verbose                 true
      
      Spark properties used, including those specified through
       --conf and those from the properties file null:
        spark.jars.ivy -> c:/tmp/ivy2
      
      
      Ivy Default Cache set to: c:\tmp\ivy2\cache
      The jars for the packages stored in: c:\tmp\ivy2\jars
      :: loading settings :: url = jar:file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
      com.datastax.spark#spark-cassandra-connector_2.11 added as a dependency
      :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
              confs: [default]
              found com.datastax.spark#spark-cassandra-connector_2.11;2.0.2 in local-m2-cache
              found com.twitter#jsr166e;1.1.0 in local-m2-cache
              found commons-beanutils#commons-beanutils;1.9.3 in central
              found commons-collections#commons-collections;3.2.2 in local-m2-cache
              found org.joda#joda-convert;1.2 in local-m2-cache
              found joda-time#joda-time;2.3 in central
              found io.netty#netty-all;4.0.33.Final in local-m2-cache
              found org.scala-lang#scala-reflect;2.11.8 in local-m2-cache
      :: resolution report :: resolve 378ms :: artifacts dl 8ms
              :: modules in use:
              com.datastax.spark#spark-cassandra-connector_2.11;2.0.2 from local-m2-cache in [default]
              com.twitter#jsr166e;1.1.0 from local-m2-cache in [default]
              commons-beanutils#commons-beanutils;1.9.3 from central in [default]
              commons-collections#commons-collections;3.2.2 from local-m2-cache in [default]
              io.netty#netty-all;4.0.33.Final from local-m2-cache in [default]
              joda-time#joda-time;2.3 from central in [default]
              org.joda#joda-convert;1.2 from local-m2-cache in [default]
              org.scala-lang#scala-reflect;2.11.8 from local-m2-cache in [default]
              ---------------------------------------------------------------------
              |                  |            modules            ||   artifacts   |
              |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
              ---------------------------------------------------------------------
              |      default     |   8   |   0   |   0   |   0   ||   8   |   0   |
              ---------------------------------------------------------------------
      :: retrieving :: org.apache.spark#spark-submit-parent
              confs: [default]
              0 artifacts copied, 8 already retrieved (0kB/11ms)
      Main class:
      org.apache.spark.repl.Main
      Arguments:
      
      System properties:
      spark.jars.ivy -> c:/tmp/ivy2
      SPARK_SUBMIT -> true
      spark.app.name -> Spark shell
      spark.jars -> file:/c:/tmp/ivy2/jars/com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar,file:/c:/tmp/ivy2/jars/com.twitter_jsr166e-1.1.0.jar,file:/c:/tmp/ivy2/jars/commons-beanutils_commons-beanutils-1.9.3.jar,file:/c:/tmp/ivy2/jars/org.joda_joda-convert-1.2.jar,file:/c:/tmp/ivy2/jars/joda-time_joda-time-2.3.jar,file:/c:/tmp/ivy2/jars/io.netty_netty-all-4.0.33.Final.jar,file:/c:/tmp/ivy2/jars/org.scala-lang_scala-reflect-2.11.8.jar,file:/c:/tmp/ivy2/jars/commons-collections_commons-collections-3.2.2.jar
      spark.submit.deployMode -> client
      spark.master -> local[*]
      Classpath elements:
      c:\tmp\ivy2\jars\com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar
      c:\tmp\ivy2\jars\com.twitter_jsr166e-1.1.0.jar
      c:\tmp\ivy2\jars\commons-beanutils_commons-beanutils-1.9.3.jar
      c:\tmp\ivy2\jars\org.joda_joda-convert-1.2.jar
      c:\tmp\ivy2\jars\joda-time_joda-time-2.3.jar
      c:\tmp\ivy2\jars\io.netty_netty-all-4.0.33.Final.jar
      c:\tmp\ivy2\jars\org.scala-lang_scala-reflect-2.11.8.jar
      c:\tmp\ivy2\jars\commons-collections_commons-collections-3.2.2.jar
      
      
      Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      Setting default log level to "WARN".
      To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
      17/07/07 15:45:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      17/07/07 15:45:28 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/bin/../jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/jars/datanucleus-core-3.2.10.jar."
      17/07/07 15:45:28 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/bin/../jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/jars/datanucleus-api-jdo-3.2.6.jar."
      17/07/07 15:45:28 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/jars/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/bin/../jars/datanucleus-rdbms-3.2.9.jar."
      17/07/07 15:45:31 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
      Spark context Web UI available at http://192.168.56.1:4040
      Spark context available as 'sc' (master = local[*], app id = local-1499435127578).
      Spark session available as 'spark'.
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
            /_/
      
      Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121)
      Type in expressions to have them evaluated.
      Type :help for more information.
      
      scala>
      
      scala> import com.datastax.spark.connector._
      <console>:23: error: object datastax is not a member of package com
             import com.datastax.spark.connector._
                        ^
      

The behaviour differs when the downloaded jar is added to the classpath explicitly with spark.driver.extraClassPath; the import then succeeds:

      spark-shell --conf spark.driver.extraClassPath="c:\tmp\ivy2\jars\com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar" --verbose
      

      Complete output:

spark-shell --conf spark.driver.extraClassPath="c:\tmp\ivy2\jars\com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar" --verbose
      Using properties file: null
      Parsed arguments:
        master                  local[*]
        deployMode              null
        executorMemory          null
        executorCores           null
        totalExecutorCores      null
        propertiesFile          null
        driverMemory            null
        driverCores             null
        driverExtraClassPath    c:\tmp\ivy2\jars\com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar
        driverExtraLibraryPath  null
        driverExtraJavaOptions  null
        supervise               false
        queue                   null
        numExecutors            null
        files                   null
        pyFiles                 null
        archives                null
        mainClass               org.apache.spark.repl.Main
        primaryResource         spark-shell
        name                    Spark shell
        childArgs               []
        jars                    null
        packages                null
        packagesExclusions      null
        repositories            null
        verbose                 true
      
      Spark properties used, including those specified through
       --conf and those from the properties file null:
        spark.driver.extraClassPath -> c:\tmp\ivy2\jars\com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar
      
      
      Main class:
      org.apache.spark.repl.Main
      Arguments:
      
      System properties:
      SPARK_SUBMIT -> true
      spark.app.name -> Spark shell
      spark.jars ->
      spark.submit.deployMode -> client
      spark.master -> local[*]
      spark.driver.extraClassPath -> c:\tmp\ivy2\jars\com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar
      Classpath elements:
      
      
      
      Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      Setting default log level to "WARN".
      To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
      17/07/07 16:05:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      17/07/07 16:05:18 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/bin/../jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/jars/datanucleus-core-3.2.10.jar."
      17/07/07 16:05:18 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/bin/../jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/jars/datanucleus-api-jdo-3.2.6.jar."
      17/07/07 16:05:18 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/jars/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/hadoop/spark-2.1.1-bin-hadoop2.7/bin/../jars/datanucleus-rdbms-3.2.9.jar."
      17/07/07 16:05:21 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
      Spark context Web UI available at http://192.168.56.1:4040
      Spark context available as 'sc' (master = local[*], app id = local-1499436317287).
      Spark session available as 'spark'.
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
            /_/
      
      Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121)
      Type in expressions to have them evaluated.
      Type :help for more information.
      
      scala> import com.datastax.spark.connector._
      import com.datastax.spark.connector._
      
      
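For reference, the same workaround can be made persistent by placing the setting in conf/spark-defaults.conf instead of passing it on every invocation. This is a sketch using the path and jar name from the report above:

```
spark.driver.extraClassPath  c:\tmp\ivy2\jars\com.datastax.spark_spark-cassandra-connector_2.11-2.0.2.jar
```

Note this only works around the symptom for a single jar; transitive dependencies resolved by --packages would each need to be listed on the path separately.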

People

• Assignee: Devaraj K (devaraj.k)
• Reporter: Goran Blankendal (gblankendal)
• Votes: 0
• Watchers: 3