Apache Arrow / ARROW-4057

[Python] Revamp handling of file URIs in pyarrow.parquet

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: Python
    • Labels:

      Description

      The current handling of file URIs is brittle. If the HDFS cluster used to run the unit tests does not allow writes from $USER, the tests fail. For example, the only permissioned user in the docker-compose cluster is "root", so the unit tests cannot be run there.

      I'm inserting various hacks to get the tests passing for now, but they are temporary. Code for path and URI handling is spread throughout the parquet module; it would be much better to centralize it and clean it up.
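
      To illustrate what "centralize" could mean here, below is a minimal sketch of a single helper that every read/write entry point could call instead of parsing paths ad hoc. The names `ParsedLocation` and `parse_location` are hypothetical and do not exist in pyarrow; the sketch uses only the standard library, and the rule that treats one-letter schemes as Windows drive letters is an assumption.

```python
from collections import namedtuple
from urllib.parse import urlparse

# Hypothetical container for the pieces a filesystem connection would need.
# Not part of pyarrow; shown only to illustrate a single parsing entry point.
ParsedLocation = namedtuple(
    "ParsedLocation", ["scheme", "host", "port", "user", "path"]
)


def parse_location(where):
    """Split a plain path or a URI string into its components."""
    parsed = urlparse(where)
    # Assumption: no scheme, or a one-letter scheme (a Windows drive letter
    # such as "C:"), means a local path that is passed through unchanged.
    if len(parsed.scheme) <= 1:
        return ParsedLocation("file", None, None, None, where)
    return ParsedLocation(
        parsed.scheme,
        parsed.hostname,
        parsed.port,
        parsed.username,  # lets the caller connect as an explicit user
        parsed.path,
    )


# Example: the user comes from the URI rather than from $USER.
loc = parse_location("hdfs://root@namenode:8020/tmp/data.parquet")
local = parse_location("/tmp/data.parquet")
```

      With something along these lines, the HDFS user could be taken from the URI rather than implicitly from $USER, which is the failure mode described above. How the parsed pieces map onto an actual filesystem connection is left open.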

            People

            • Assignee: Unassigned
            • Reporter: Wes McKinney (wesmckinn)
