Apache Arrow / ARROW-10998

[C++] Filesystems: detect if URI is passed where a file path is required and raise informative error

Details

    Description

      Currently, passing a URI to a filesystem method (other than from_uri), or to another function that accepts a filesystem object, can produce a rather cryptic error message. In the S3 example below, the error is "No response body".

      Ideally, the filesystem object would know its own URI scheme, so it could detect when a user passes a URI instead of a file path and raise a more informative error.

      Example with S3:

      >>> from pyarrow.fs import S3FileSystem
      >>> fs = S3FileSystem(region="us-east-2")
      >>> fs.get_file_info('s3://ursa-labs-taxi-data/2016/01/')
      ...
      OSError: When getting information for key '/ursa-labs-taxi-data/2016/01' in bucket 's3:': AWS Error [code 100]: No response body.
      
      >>> import pyarrow.parquet as pq
      >>> table = pq.read_table('s3://ursa-labs-taxi-data/2016/01/data.parquet', filesystem=fs)
      ...
      OSError: When getting information for key '/ursa-labs-taxi-data/2016/01/data.parquet' in bucket 's3:': AWS Error [code 100]: No response body.
      

      With a local filesystem, no error is raised at all; the URI is simply reported as a file that was not found:

      >>> from pyarrow.fs import LocalFileSystem
      >>> fs = LocalFileSystem()
      >>> fs.get_file_info("file:///home")
      <FileInfo for 'file:///home': type=FileType.NotFound>
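
The kind of check proposed above could look roughly like the following. This is only a minimal pure-Python sketch, not how Arrow implements it: `check_not_uri` is a hypothetical helper name, and the "scheme followed by `://`" rule is an assumption made for illustration. A real implementation might instead compare against the scheme the filesystem itself is known to handle.

```python
import re

# Hypothetical pattern: an RFC 3986-style scheme (a letter followed by letters,
# digits, "+", "-" or ".") and then "://". Requiring at least two characters
# in the scheme avoids flagging Windows drive letters such as "C:/data".
_URI_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]+)://")


def check_not_uri(path):
    """Hypothetical helper: raise if `path` looks like a URI, else return it."""
    match = _URI_SCHEME.match(path)
    if match is not None:
        raise ValueError(
            f"Expected a file path but got a URI with scheme "
            f"'{match.group(1)}': {path!r}. "
            "Pass the path without the scheme, or use FileSystem.from_uri."
        )
    return path
```

With a helper like this, a call such as `fs.get_file_info(check_not_uri('s3://ursa-labs-taxi-data/2016/01/'))` would fail early with a message that names the scheme, instead of reaching S3 with a bucket named `s3:`.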
      

      cc apitrou

            People

              apitrou Antoine Pitrou
              jorisvandenbossche Joris Van den Bossche
              Votes: 0
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2.5h