Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.0.0
- Fix Version/s: None
Description
It's not possible to open an `abfs://` or `abfss://` URI with `pyarrow.fs.HadoopFileSystem`.
Calling `HadoopFileSystem.from_uri(path)` fails: libhdfs throws an error saying that the authority is invalid (I checked, and this is because the authority string is empty).
Note that the legacy `pyarrow.hdfs.HadoopFileSystem` interface works, for example with either of:
- pyarrow.hdfs.HadoopFileSystem(host="abfs://xxx@xxx.dfs.core.windows.net")
- pyarrow.hdfs.connect(host="abfs://xxx@xxx.dfs.core.windows.net")
and I believe the new interface should work too when the full URI is passed as "host" to the `pyarrow.fs.HadoopFileSystem` constructor. However, the constructor wrongly prepends "hdfs://" to the host: https://github.com/apache/arrow/blob/25c736d48dc289f457e74d15d05db65f6d539447/python/pyarrow/_hdfs.pyx#L64
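
To illustrate the kind of behaviour I would expect instead of the unconditional prefixing described above, here is a minimal sketch in plain Python. It is not the pyarrow code: `normalize_host` is a hypothetical helper name, and treating "default" as a pass-through value is an assumption on my part. The idea is simply to add "hdfs://" only when the host does not already carry a scheme.

```python
import re

# Matches a leading URI scheme such as "hdfs://", "abfs://" or "abfss://".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def normalize_host(host: str) -> str:
    """Hypothetical helper (not part of pyarrow).

    Prepend "hdfs://" only when `host` has no scheme of its own.
    Passing "default" through unchanged is an assumption.
    """
    if host == "default" or _SCHEME_RE.match(host):
        return host
    return "hdfs://" + host


if __name__ == "__main__":
    for h in ("namenode", "hdfs://namenode", "abfs://xxx@xxx.dfs.core.windows.net"):
        print(h, "->", normalize_host(h))
```

With something along these lines, a bare host name would still be treated as HDFS, while an `abfs://` or `abfss://` URI would reach libhdfs unchanged.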