Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Incomplete
Affects Version/s: 1.5.0
Fix Version/s: None
Description
I observed some perplexing errors while running $SPARK_HOME/bin/spark-shell yesterday (with $SPARK_HOME pointing at a clean 1.5.0 install):
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3.S3FileSystem not found
while initializing HiveContext; full example output is here.
The issue was that a stray META-INF directory from some other project I'd built months ago was sitting in the directory that I'd run spark-shell from (not in my $SPARK_HOME, just in the directory I happened to be in when I ran $SPARK_HOME/bin/spark-shell).
That META-INF had a services/org.apache.hadoop.fs.FileSystem file specifying some provider classes (S3FileSystem in the example above) that were unsurprisingly not resolvable by Spark.
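For reference, a minimal Scala sketch of the ServiceLoader lookup that Hadoop's FileSystem initialization appears to perform (HADOOP-12636 below points at the same mechanism). It is only illustrative, not Spark or Hadoop source; it assumes the Hadoop client classes are on the classpath, e.g. inside spark-shell:
import java.util.ServiceLoader
import org.apache.hadoop.fs.FileSystem

// ServiceLoader reads every META-INF/services/org.apache.hadoop.fs.FileSystem
// resource visible on the classpath (including a stray one under the working
// directory, if that directory ends up on the classpath) and instantiates each
// provider class it lists.
val loader = ServiceLoader.load(classOf[FileSystem])
val it = loader.iterator()
while (it.hasNext) {
  // next() is where ServiceConfigurationError ("Provider ... not found") surfaces
  // when a listed class such as org.apache.hadoop.fs.s3.S3FileSystem is unresolvable.
  println(it.next().getClass.getName)
}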
I'm not sure whether this is purely my fault for running Spark from a directory with another project's config files lying around, but I find it somewhat surprising that, given a $SPARK_HOME pointing to a clean Spark install, $SPARK_HOME/bin/spark-shell picks up detritus from the cwd it is called from, so I wanted to at least document it here.
Attachments
Issue Links
- relates to HADOOP-12636 Prevent ServiceLoader failure init for unused FileSystems (Closed)