SPARK-6136

Docker client library introduces Guava 17.0, which causes runtime binary incompatibilities


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.3.0
    • Component/s: SQL
    • Labels: None

    Description

      Integration test suites in the JDBC data source (MySQLIntegration and PostgresIntegration) depend on docker-client 2.7.5, which transitively depends on Guava 17.0. Unfortunately, Guava 17.0 causes runtime binary incompatibilities when Spark is compiled against Hadoop 2.4:

      $ ./build/sbt -Pyarn,hadoop-2.4,hive,hive-0.12.0,scala-2.10 -Dhadoop.version=2.4.1
      ...
      > sql/test-only *.ParquetDataSourceOffIOSuite
      ...
      [info] ParquetDataSourceOffIOSuite:
      [info] Exception encountered when attempting to run a suite with class name: org.apache.spark.sql.parquet.ParquetDataSourceOffIOSuite *** ABORTED *** (134 milliseconds)
      [info]   java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
      [info]   at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:261)
      [info]   at parquet.hadoop.ParquetInputFormat.listStatus(ParquetInputFormat.java:277)
      [info]   at org.apache.spark.sql.parquet.FilteringParquetRowInputFormat.getSplits(ParquetTableOperations.scala:437)
      [info]   at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
      [info]   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
      [info]   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
      [info]   at scala.Option.getOrElse(Option.scala:120)
      [info]   at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
      [info]   at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
      [info]   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
      [info]   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
      [info]   at scala.Option.getOrElse(Option.scala:120)
      [info]   at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
      [info]   at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
      [info]   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
      [info]   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
      [info]   at scala.Option.getOrElse(Option.scala:120)
      [info]   at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
      [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1525)
      [info]   at org.apache.spark.rdd.RDD.collect(RDD.scala:813)
      [info]   at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:83)
      [info]   at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:797)
      [info]   at org.apache.spark.sql.QueryTest$.checkAnswer(QueryTest.scala:115)
      [info]   at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:60)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase$$anonfun$checkParquetFile$1.apply(ParquetIOSuite.scala:76)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase$$anonfun$checkParquetFile$1.apply(ParquetIOSuite.scala:76)
      [info]   at org.apache.spark.sql.parquet.ParquetTest$$anonfun$withParquetDataFrame$1.apply(ParquetTest.scala:105)
      [info]   at org.apache.spark.sql.parquet.ParquetTest$$anonfun$withParquetDataFrame$1.apply(ParquetTest.scala:105)
      [info]   at org.apache.spark.sql.parquet.ParquetTest$$anonfun$withParquetFile$1.apply(ParquetTest.scala:94)
      [info]   at org.apache.spark.sql.parquet.ParquetTest$$anonfun$withParquetFile$1.apply(ParquetTest.scala:92)
      [info]   at org.apache.spark.sql.parquet.ParquetTest$class.withTempPath(ParquetTest.scala:71)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase.withTempPath(ParquetIOSuite.scala:67)
      [info]   at org.apache.spark.sql.parquet.ParquetTest$class.withParquetFile(ParquetTest.scala:92)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase.withParquetFile(ParquetIOSuite.scala:67)
      [info]   at org.apache.spark.sql.parquet.ParquetTest$class.withParquetDataFrame(ParquetTest.scala:105)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase.withParquetDataFrame(ParquetIOSuite.scala:67)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase.checkParquetFile(ParquetIOSuite.scala:76)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase$$anonfun$1.apply$mcV$sp(ParquetIOSuite.scala:83)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase$$anonfun$1.apply(ParquetIOSuite.scala:79)
      [info]   at org.apache.spark.sql.parquet.ParquetIOSuiteBase$$anonfun$1.apply(ParquetIOSuite.scala:79)
      [info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
      [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
      [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
      [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
      [info]   at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
      [info]   at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555)
      [info]   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
      [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
      [info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
      [info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
      [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
      [info]   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
      [info]   at scala.collection.immutable.List.foreach(List.scala:318)
      [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
      [info]   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
      [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
      [info]   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
      [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
      [info]   at org.scalatest.Suite$class.run(Suite.scala:1424)
      [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
      [info]   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
      [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
      [info]   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
      [info]   at org.apache.spark.sql.parquet.ParquetDataSourceOffIOSuite.org$scalatest$BeforeAndAfterAll$$super$run(ParquetIOSuite.scala:346)
      [info]   at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257)
      [info]   at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256)
      [info]   at org.apache.spark.sql.parquet.ParquetDataSourceOffIOSuite.run(ParquetIOSuite.scala:346)
      [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:462)
      [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:671)
      [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:294)
      [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:284)
      [info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      [info]   at java.lang.Thread.run(Thread.java:745)
      

      This is because the default constructor of Stopwatch is no longer public in Guava 17.0, while Hadoop 2.4's FileInputFormat was compiled against an older Guava in which it was, so the constructor call fails at runtime with IllegalAccessError.
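
      The mismatch is easy to reproduce in isolation. Below is a minimal, illustrative sketch (the StopwatchRepro object is hypothetical), assuming the code is compiled against Guava 16.0.1 or earlier and then run with Guava 17.0 on the classpath:

      import com.google.common.base.Stopwatch

      object StopwatchRepro extends App {
        // Against Guava <= 16.x the no-arg constructor is public, so this
        // compiles and links. Running the same bytecode with Guava 17.0,
        // where the constructor is package-private, throws
        // java.lang.IllegalAccessError, the same failure seen in
        // FileInputFormat.listStatus above.
        val sw = new Stopwatch()

        // Factory method available since Guava 15.0 and still public in
        // 17.0, so it stays binary compatible across the upgrade:
        val sw2 = Stopwatch.createStarted()
        println(sw2.isRunning)
      }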

      Compiling Spark against Hive 0.12.0 also surfaces other kinds of runtime binary incompatibilities.

      Considering that MySQLIntegration and PostgresIntegration are ignored right now, I'd suggest moving them from the Spark project to the [Spark integration tests|https://github.com/databricks/spark-integration-tests] project.
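
      If they are kept in-tree instead, one possible mitigation would be to exclude the transitive Guava from docker-client in the SQL module's SBT build. A sketch only, not necessarily how this issue gets fixed (the docker-client coordinates are the real ones; the exclusion itself is a suggestion):

      // Keep docker-client on the test classpath for the integration
      // suites, but drop its transitive Guava 17.0 so that the Guava
      // version Hadoop 2.4 expects stays on the classpath.
      libraryDependencies += ("com.spotify" % "docker-client" % "2.7.5" % "test")
        .exclude("com.google.guava", "guava")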


          People

            Assignee: Cheng Lian
            Reporter: Cheng Lian
            Votes: 0
            Watchers: 2
