It looks like Spark 3.2.0's POMs are no longer "dependency reduced". As a result, applications may pull in additional unnecessary dependencies when depending on Spark.
Spark uses the Maven Shade plugin to create effective POMs and to bundle shaded versions of certain libraries with Spark (namely, Jetty, Guava, and JPMML). By default, the Maven Shade plugin generates simplified ("dependency reduced") POMs that omit dependencies on artifacts that have been shaded.
In Spark 3.2.0 this reduction no longer appears to happen, so the generated POMs now include compile-scope dependencies on the shaded libraries. For example, compare the org.eclipse.jetty dependencies in:
- Spark 3.1.2: https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.12/3.1.2/spark-core_2.12-3.1.2.pom
- Spark 3.2.0 RC2: https://repository.apache.org/content/repositories/orgapachespark-1390/org/apache/spark/spark-core_2.12/3.2.0/spark-core_2.12-3.2.0.pom
I think we should go back to generating "dependency reduced" POMs, both to ensure that Spark declares a proper set of dependencies and to avoid "unknown unknown" consequences of changing our generated POM format.
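For reference, the Maven Shade plugin setting that controls this behavior is createDependencyReducedPom, which defaults to true. The fragment below is only a minimal sketch of a shade plugin configuration with that setting left enabled. It is not copied from Spark's build, and it omits the plugin version, executions, and relocation rules.

```xml
<!-- Illustrative sketch, not Spark's actual pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <!-- true is the plugin default: shaded artifacts are removed from the
         dependency list of the generated (dependency reduced) POM. -->
    <createDependencyReducedPom>true</createDependencyReducedPom>
  </configuration>
</plugin>
```

With this set to false, the published POM keeps the original dependency list, which would match the compile-scope Jetty dependencies seen in the 3.2.0 RC2 POM above.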
/cc Chao Sun