  1. Apache Arrow
  2. ARROW-16928

[C++] Reconsider filesystem equality


Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++
    • Labels: None

    Description

      Filesystems support an equality method for comparing filesystem instances. The original idea was that all filesystem parameters should be transparent and easy to read back, which makes equality straightforward to support. (Similarly, it was envisioned that filesystems could be round-tripped through URIs, though the filesystem-to-URI direction was never implemented.)
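The "transparent parameters" idea can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyFileSystem`, its fields, and its `to_uri` method are hypothetical names invented for illustration and are not part of Arrow (where the filesystem-to-URI direction does not exist). When every parameter can be read back, equality is a field-by-field comparison, and it can be used to check a URI round trip.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


@dataclass(frozen=True)
class ToyFileSystem:
    """Hypothetical filesystem whose every parameter is transparent."""
    scheme: str
    host: str
    use_mmap: bool = False

    def to_uri(self):
        # Hypothetical direction: serialize all parameters into a URI.
        query = urlencode({"use_mmap": int(self.use_mmap)})
        return urlunsplit((self.scheme, self.host, "/", query, ""))

    @classmethod
    def from_uri(cls, uri):
        # Rebuild the instance purely from what the URI carries.
        parts = urlsplit(uri)
        params = dict(parse_qsl(parts.query))
        return cls(parts.scheme, parts.netloc, params.get("use_mmap") == "1")


fs = ToyFileSystem("toy", "example-host", use_mmap=True)
uri = fs.to_uri()
restored = ToyFileSystem.from_uri(uri)

# The dataclass-generated __eq__ compares every field, so equality here
# is exact precisely because nothing about the instance is hidden.
roundtrip_ok = restored == fs
```

The round trip only works because the toy class has no opaque state; the moment a parameter cannot be read back, neither `to_uri` nor exact equality can be implemented faithfully.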

      However, along the way, filesystems such as S3 grew increasingly complex and opaque modes of configuration, for which equality can only be approximated. Equality can also be costly to compute: for example, S3Options::Equals involves fetching the actual secret key and session token, which can take some time. These operations alone consume 5 seconds in the PyArrow test suite.
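The cost and the approximation trade-off can be sketched with a toy model. Everything below is an assumption made for illustration: `LazyCredentials`, `ToyS3Options`, `equals_exact`, and `equals_approx` are hypothetical names and do not reflect how S3Options::Equals is actually implemented. The sketch only shows why comparing lazily resolved credentials forces a potentially slow fetch, while comparing transparent fields alone is cheap but approximate.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class LazyCredentials:
    """Hypothetical credentials provider that resolves secrets on demand."""
    resolver: Callable[[], Tuple[str, str]]
    calls: int = 0

    def resolve(self):
        self.calls += 1  # count how often the expensive fetch happens
        return self.resolver()


@dataclass
class ToyS3Options:
    """Hypothetical options object; not the real S3Options."""
    region: str
    credentials: LazyCredentials

    def equals_exact(self, other):
        # Cheap transparent fields first, then the costly credentials fetch.
        if self.region != other.region:
            return False
        return self.credentials.resolve() == other.credentials.resolve()

    def equals_approx(self, other):
        # Compares transparent fields only; may report equality for
        # instances that would authenticate differently.
        return self.region == other.region


def slow_resolver():
    time.sleep(0.05)  # stands in for a network round trip
    return ("access-key", "secret-key")


a = ToyS3Options("us-east-1", LazyCredentials(slow_resolver))
b = ToyS3Options("us-east-1", LazyCredentials(slow_resolver))

approx_equal = a.equals_approx(b)        # no credentials fetched
calls_after_approx = a.credentials.calls
exact_equal = a.equals_exact(b)          # fetches credentials on both sides
```

In this toy model the exact comparison pays for one fetch per instance on every call, which is the kind of overhead the description attributes to the test suite.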

      Right now, filesystem equality is only used in tests on the Python side (to try to validate filesystem pickling).

      We should decide whether we want to continue supporting filesystem equality and, if so, what its semantics should be (for example, is approximate equality useful?).

          People

            Assignee: Unassigned
            Reporter: Antoine Pitrou (apitrou)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: