Apache Arrow / ARROW-3646

[Python] Add convenience factories to create IO streams

Details

    Description

      Currently, creating IO streams requires invoking various constructors with irregular names. It would be nice to expose a higher-level interface.

      For example:

      def open_reader(source, compression='detect'):
          """
          Create an Arrow input stream.
      
          Parameters
          ----------
          source: str, Path, buffer, file-like object, ...
              The source to open for reading
          compression: str or None
              The compression algorithm to use for on-the-fly decompression.
              If 'detect' and source is a file path, then compression will be
              chosen based on the file extension.
              If None, no decompression will be applied.
              Otherwise, a well-known algorithm name must be supplied (e.g. "gzip")
          """
      
      def open_writer(source, compression='detect'):
          """
          Create an Arrow output stream.
      
          Parameters
          ----------
          source: str, Path, buffer, file-like object, ...
              The source to open for writing
          compression: str or None
              The compression algorithm to use for on-the-fly compression.
              If 'detect' and source is a file path, then compression will be
              chosen based on the file extension.
              If None, no compression will be applied.
              Otherwise, a well-known algorithm name must be supplied (e.g. "gzip")
          """
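
      The 'detect' behaviour described in both docstrings could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `resolve_compression` and the extension table are hypothetical and are not part of the proposal, which does not say which extensions should be recognised. The sketch uses only the standard library and returns a codec name (or None) that a factory could then pass on to whatever compressed stream class is used.

```python
import os

# Hypothetical extension table (an assumption for illustration only);
# the proposal does not list the extensions that should be recognised.
_EXTENSION_TO_COMPRESSION = {
    ".gz": "gzip",
    ".bz2": "bz2",
    ".lz4": "lz4",
    ".zst": "zstd",
    ".br": "brotli",
}


def resolve_compression(source, compression="detect"):
    """Return the codec name to use for `source`, or None for no compression.

    Hypothetical helper: mirrors the three cases in the docstrings above.
    """
    # Case 1: the caller explicitly disabled compression.
    if compression is None:
        return None
    # Case 2: the caller named an algorithm explicitly (e.g. "gzip").
    if compression != "detect":
        return compression
    # Case 3: 'detect' only makes sense for file paths; look at the extension.
    if isinstance(source, (str, os.PathLike)):
        ext = os.path.splitext(os.fspath(source))[1]
        # Compare as lowercase text so "DATA.GZ" is treated like "data.gz".
        if isinstance(ext, bytes):
            ext = ext.decode("utf-8", errors="replace")
        return _EXTENSION_TO_COMPRESSION.get(ext.lower())
    # Buffers and file-like objects carry no extension to inspect.
    return None
```

      With such a helper, `open_reader` and `open_writer` could share the same detection logic and differ only in whether they wrap the underlying stream for decompression or compression.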
      

      Thoughts?

            People

              apitrou Antoine Pitrou

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 3h 20m