Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
None
Description
Since https://github.com/apache/arrow/pull/6597, local relative paths don't work anymore:
In [1]: import pyarrow.dataset as ds

In [2]: ds.dataset("test.parquet")
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-2-23ecfce52d13> in <module>
----> 1 ds.dataset("test.parquet")

~/scipy/repos/arrow/python/pyarrow/dataset.py in dataset(paths_or_factories, filesystem, partitioning, format)
    327
    328     if isinstance(paths_or_factories, str):
--> 329         return factory(paths_or_factories, **kwargs).finish()
    330
    331     if not isinstance(paths_or_factories, list):

~/scipy/repos/arrow/python/pyarrow/dataset.py in factory(path_or_paths, filesystem, partitioning, format)
    246         factories = []
    247         for path in path_or_paths:
--> 248             fs, paths_or_selector = _ensure_fs_and_paths(path, filesystem)
    249             factories.append(FileSystemDatasetFactory(fs, paths_or_selector,
    250                                                       format, options))

~/scipy/repos/arrow/python/pyarrow/dataset.py in _ensure_fs_and_paths(path, filesystem)
    165     from pyarrow.fs import FileType, FileSelector
    166
--> 167     filesystem, path = _ensure_fs(filesystem, _stringify_path(path))
    168     infos = filesystem.get_target_infos([path])[0]
    169     if infos.type == FileType.Directory:

~/scipy/repos/arrow/python/pyarrow/dataset.py in _ensure_fs(filesystem, path)
    158     if filesystem is not None:
    159         return filesystem, path
--> 160     return FileSystem.from_uri(path)
    161
    162

~/scipy/repos/arrow/python/pyarrow/_fs.pyx in pyarrow._fs.FileSystem.from_uri()

~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.pyarrow_internal_check_status()

~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: URI has empty scheme: 'test.parquet'
apitrou Is this something that should be fixed in FileSystemFromUriOrPath, or rather on the Python side? (FileSystem.from_uri resolves pathlib objects to an absolute path, but does not do so for strings.)
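
For illustration, a Python-side fix might look roughly like the sketch below. It is not the actual patch: `absolutize_local_path` is a hypothetical helper name, and treating any string without a URI scheme as a local path is an assumption. It uses only the standard library, so it runs without pyarrow.

```python
import os
from urllib.parse import urlsplit


def absolutize_local_path(path):
    """Hypothetical helper: make scheme-less paths absolute, leave URIs alone."""
    path = os.fspath(path)  # accepts both str and pathlib.Path
    scheme = urlsplit(path).scheme
    # Assumption: a one-letter "scheme" is a Windows drive letter (C:\...),
    # so only longer schemes are treated as real URIs.
    if len(scheme) > 1:
        return path
    # Relative (or already absolute) local path: resolve against the cwd.
    return os.path.abspath(path)


print(absolutize_local_path("test.parquet"))
print(absolutize_local_path("s3://bucket/test.parquet"))
```

The returned string could then be handed to `FileSystem.from_uri` in place of the raw relative path. Whether the resolution belongs here or in FileSystemFromUriOrPath is exactly the open question above.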