Description
Saw this while examining a 1.5.0 RC. It's probably not important enough to warrant a new RC, but it's something we should fix anyway.
The kudu-client-tools JAR has Apache Commons and Parquet classes in it. The kudu-spark2-tools_2.11 JAR has Spark Avro, Apache Avro, Apache Commons, and other classes in it. These are all extraneous.
I believe these class inclusions were introduced in commit 5d53a3b, namely in these pom.xml changes:
diff --git a/java/kudu-client-tools/pom.xml b/java/kudu-client-tools/pom.xml
index d4908fa..65ac4e3 100644
--- a/java/kudu-client-tools/pom.xml
+++ b/java/kudu-client-tools/pom.xml
@@ -86,6 +86,11 @@
       <version>${slf4j.version}</version>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>org.apache.parquet</groupId>
+      <artifactId>parquet-hadoop</artifactId>
+      <version>${parquet.version}</version>
+    </dependency>
   </dependencies>
diff --git a/java/kudu-spark-tools/pom.xml b/java/kudu-spark-tools/pom.xml
index c2eb57f..98ffe28 100644
--- a/java/kudu-spark-tools/pom.xml
+++ b/java/kudu-spark-tools/pom.xml
@@ -98,6 +99,11 @@
       <scope>test</scope>
     </dependency>
     <dependency>
+      <groupId>com.databricks</groupId>
+      <artifactId>spark-avro_2.10</artifactId>
+      <version>${sparkavro.version}</version>
+    </dependency>
+    <dependency>
Both of these new dependencies should probably be of scope 'provided'.
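For illustration, here is a rough sketch of what the two dependency declarations might look like with that scope applied. This is untested and assumes the runtime environment supplies the Parquet and Spark Avro classes, so they don't need to be bundled into the tools JARs. The coordinates and version properties are copied from the diff above; only the <scope> lines are new.

```xml
<!-- java/kudu-client-tools/pom.xml (sketch, not a tested patch) -->
<dependency>
  <groupId>org.apache.parquet</groupId>
  <artifactId>parquet-hadoop</artifactId>
  <version>${parquet.version}</version>
  <!-- Proposed: available at compile time, expected to be supplied at runtime -->
  <scope>provided</scope>
</dependency>

<!-- java/kudu-spark-tools/pom.xml (sketch, not a tested patch) -->
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-avro_2.10</artifactId>
  <version>${sparkavro.version}</version>
  <!-- Proposed: same reasoning as above -->
  <scope>provided</scope>
</dependency>
```

Whether 'provided' is actually enough to keep these classes out of the final JARs depends on how the tools modules assemble their artifacts, so the resulting JAR contents would need to be rechecked after the change.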