Apache Arrow / ARROW-8213

[Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message

Details

    Description

      Even after the previous PRs related to local paths (https://github.com/apache/arrow/pull/6643, https://github.com/apache/arrow/pull/6655), I don't think the user experience is optimal when you are working with local files and pass a wrong, non-existent path (e.g. due to a typo).

      Currently, you get this error:

      >>> import pyarrow.dataset as ds
      >>> dataset = ds.dataset("data_with_typo.parquet", format="parquet")
      ...
      ArrowInvalid: URI has empty scheme: 'data_with_typo.parquet'
      

      where "URI has empty scheme" is rather confusing for the user in case of a non-existent path. I think ideally we should raise a "No such file or directory" error.

      I am not fully sure what the best solution is, as FileSystem.from_uri can also give other errors that we do want to propagate to the user.
      The most straightforward approach I can think of is checking whether "URI has empty scheme" is in the error message and then rewording it, but that's not very clean.
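
      The message-matching idea could look roughly like the sketch below. It is only an illustration, not the pyarrow implementation: `ArrowInvalid` and `fake_from_uri` are stand-ins defined here so the snippet runs without pyarrow, `resolve` is a hypothetical helper name, and the second error message in `fake_from_uri` is made up for the example.

```python
import errno
import os


class ArrowInvalid(ValueError):
    """Stand-in for pyarrow's ArrowInvalid so the sketch runs without pyarrow."""


def fake_from_uri(uri):
    # Hypothetical stand-in for FileSystem.from_uri. It reproduces the message
    # quoted in this issue for strings without a scheme; the second message is
    # invented to represent "other errors we do want to propagate".
    if "://" not in uri:
        raise ArrowInvalid(f"URI has empty scheme: '{uri}'")
    if not uri.startswith("file://"):
        raise ArrowInvalid(f"unsupported scheme in this sketch: '{uri}'")
    return "local", uri[len("file://"):]


def resolve(source, from_uri=fake_from_uri):
    # Hypothetical helper: reword only the "empty scheme" error, and only when
    # the string does not name an existing local path.
    try:
        return from_uri(source)
    except ArrowInvalid as exc:
        if "URI has empty scheme" in str(exc) and not os.path.exists(source):
            raise FileNotFoundError(
                errno.ENOENT, os.strerror(errno.ENOENT), source
            ) from exc
        raise  # every other error reaches the user unchanged
```

      With this, `resolve("data_with_typo.parquet")` raises a `FileNotFoundError` that names the path, while an error for something like `"s3://bucket/key"` still surfaces as `ArrowInvalid`. The weakness noted above remains: the check depends on the exact wording of the message.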

            People

              kszucs Krisztian Szucs
              jorisvandenbossche Joris Van den Bossche
              Votes: 0
              Watchers: 4

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 40m