Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
From https://github.com/apache/arrow/issues/3748:
CLASSPATH discovery was recently changed in d911850 to resolve ARROW-2113 and ARROW-3768.
Specifically, the logic used to find all jars under HADOOP_HOME invokes the find command directly (arrow/python/pyarrow/hdfs.py, line 144 in d911850):
    find_args = ('find', os.environ['HADOOP_HOME'], '-name', '*.jar')
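This will not work when HADOOP_HOME is a symlink: by default find does not follow symbolic links, so it never descends into the target directory and no jars are found. In that case '-L' needs to be passed to the find command.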
CLASSPATH can still be set explicitly, but this is a change in behavior as HADOOP_HOME symlinks worked without issue before.
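The change suggested above could look roughly like the following. This is a minimal sketch, not the actual pyarrow patch: it assumes a POSIX find on the PATH, and the helper name find_hadoop_jars is hypothetical. The only difference from the quoted line is the added '-L', which must come before the starting path.

```python
import os
import subprocess


def find_hadoop_jars(hadoop_home):
    """Return the paths of all *.jar files under hadoop_home.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    # '-L' tells find to follow symbolic links, so a symlinked
    # HADOOP_HOME is traversed like a regular directory. It has to
    # appear before the starting path.
    find_args = ('find', '-L', hadoop_home, '-name', '*.jar')
    output = subprocess.check_output(find_args)
    return output.decode('utf-8').splitlines()


if __name__ == '__main__':
    # Only attempt discovery when HADOOP_HOME is actually set.
    home = os.environ.get('HADOOP_HOME')
    if home:
        print(os.pathsep.join(find_hadoop_jars(home)))
```

With '-L', find also follows symlinks encountered below HADOOP_HOME, not just the top-level one. If only the starting path should be dereferenced, '-H' is the narrower option.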