Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: 3.3.1
Fix Version/s: None
None
Description
The spark-hadoop-cloud library is absent from the default Spark distribution, as are its dependencies such as hadoop-aws. The dependency management section of the "Integration with Cloud Infrastructures" documentation is therefore invalid: in practice, the libraries for cloud integration are not provided.
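To make the missing libraries easy to check, here is a minimal sketch in Python that scans a distribution's jars directory for the two artifacts named above. It assumes the jars live in a single directory (typically $SPARK_HOME/jars) and that each jar file name starts with its artifact name. The helper names find_cloud_jars and missing_cloud_artifacts are hypothetical and not part of Spark.

```python
from pathlib import Path

# Artifact name prefixes taken from this report. Assumption: jar file names
# start with the artifact name (for example "hadoop-aws-<version>.jar").
CLOUD_ARTIFACTS = ("spark-hadoop-cloud", "hadoop-aws")


def find_cloud_jars(jars_dir):
    """Map each cloud artifact prefix to the matching jar names in jars_dir."""
    names = sorted(p.name for p in Path(jars_dir).glob("*.jar"))
    return {
        artifact: [n for n in names if n.startswith(artifact)]
        for artifact in CLOUD_ARTIFACTS
    }


def missing_cloud_artifacts(jars_dir):
    """Return the artifact prefixes that have no jar in jars_dir."""
    return [a for a, jars in find_cloud_jars(jars_dir).items() if not jars]
```

Calling missing_cloud_artifacts on the jars directory of a distribution returns an empty list when both artifacts are present. For a distribution built without the hadoop-cloud profile, the report implies both names would be listed.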
A naive workaround would be to add spark-hadoop-cloud to the application as a compile-scope dependency. However, this does not work because of Spark's classpath hierarchy: the Spark system classloader does not see classes loaded by the application classloader.
A proper fix would therefore be to enable the hadoop-cloud build profile by default (-Phadoop-cloud).