Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.12.0, 0.12.1, 0.13.0
Description
When I initialize a pyarrow filesystem to connect to an HDFS cluster from Spark, libhdfs throws the error:
org/apache/hadoop/fs/FileSystem class not found
Printing CLASSPATH shows that its value consists of directory and wildcard entries:
../share/hadoop/hdfs;spark/spark-2.0.2-bin-hadoop2.7/jars...
This value is set by Spark, but libhdfs must load classes from explicit jar files.
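As a workaround, the wildcard classpath can be expanded into explicit jar paths before pyarrow is initialized. A minimal sketch, assuming the `hadoop` binary is on PATH (`hadoop classpath --glob` prints the fully expanded jar list):

    import os
    import subprocess

    # Expand wildcard classpath entries into the explicit jar paths
    # that libhdfs (via JNI) needs to locate Hadoop classes.
    classpath = subprocess.check_output(['hadoop', 'classpath', '--glob'])
    os.environ['CLASSPATH'] = classpath.decode('utf-8').strip()

    import pyarrow as pa

    fs = pa.hdfs.connect()  # libhdfs can now find org/apache/hadoop/fs/FileSystem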
Root cause: in hdfs.py, _maybe_set_hadoop_classpath() only checks whether the substring 'hadoop' appears in CLASSPATH; it never verifies that the entries are jar files:

    def _maybe_set_hadoop_classpath():
        if 'hadoop' in os.environ.get('CLASSPATH', ''):
            return
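A stricter check would require an actual Hadoop jar in CLASSPATH rather than the bare substring 'hadoop'. A minimal sketch of such a check (the hadoop-common regex is one possible heuristic, not necessarily the exact patch that was merged):

    import os
    import re

    def _maybe_set_hadoop_classpath():
        # Skip setup only when a real Hadoop jar is already on the
        # classpath; a directory or wildcard entry that merely contains
        # the string 'hadoop' is useless to libhdfs.
        if re.search(r'hadoop-common[^/]+\.jar',
                     os.environ.get('CLASSPATH', '')):
            return
        # ... otherwise derive a jar-level classpath, e.g. from
        # `hadoop classpath --glob`, and export it.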