Apache Arrow / ARROW-5156

[Python] `df.to_parquet('s3://...', partition_cols=...)` fails with `'NoneType' object has no attribute '_isfilestore'`

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 0.12.1
    • Fix Version/s: None
    • Component/s: Python
    • Environment: Mac, Linux

    Description

      According to https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#partitioning-parquet-files, writing a Parquet dataset to S3 with `partition_cols` should work, but it fails for me. The unpartitioned write succeeds; the partitioned write raises an `AttributeError`. Example script:

      import pandas as pd
      import sys
      
      print(sys.version)
      print(pd.__version__)
      df = pd.DataFrame([{'a': 1, 'b': 2}])
      
      df.to_parquet('s3://my_s3_bucket/x.parquet', engine='pyarrow')
      print('OK 1')
      
      df.to_parquet('s3://my_s3_bucket/x2.parquet', partition_cols=['a'], engine='pyarrow')
      print('OK 2')
      

      Output:

      3.5.2 (default, Feb 14 2019, 01:46:27)
      [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)]
      0.24.2
      OK 1
      Traceback (most recent call last):
        File "./t.py", line 14, in <module>
          df.to_parquet('s3://my_s3_bucket/x2.parquet', partition_cols=['a'], engine='pyarrow')
        File "/Users/vshih/.pyenv/versions/3.5.2/lib/python3.5/site-packages/pandas/core/frame.py", line 2203, in to_parquet
          partition_cols=partition_cols, **kwargs)
        File "/Users/vshih/.pyenv/versions/3.5.2/lib/python3.5/site-packages/pandas/io/parquet.py", line 252, in to_parquet
          partition_cols=partition_cols, **kwargs)
        File "/Users/vshih/.pyenv/versions/3.5.2/lib/python3.5/site-packages/pandas/io/parquet.py", line 118, in write
          partition_cols=partition_cols, **kwargs)
        File "/Users/vshih/.pyenv/versions/3.5.2/lib/python3.5/site-packages/pyarrow/parquet.py", line 1227, in write_to_dataset
          _mkdir_if_not_exists(fs, root_path)
        File "/Users/vshih/.pyenv/versions/3.5.2/lib/python3.5/site-packages/pyarrow/parquet.py", line 1182, in _mkdir_if_not_exists
          if fs._isfilestore() and not fs.exists(path):
      AttributeError: 'NoneType' object has no attribute '_isfilestore'
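
The traceback suggests that `fs` was `None` by the time `_mkdir_if_not_exists` ran, i.e. no filesystem object had been resolved for the `s3://` path. The following is a minimal, standard-library-only sketch of that failure mode and of a guarded variant. It is not pyarrow's code: `LocalStore`, `resolve_fs`, `mkdir_checked`, and `try_mkdir` are hypothetical stand-ins, and only the `fs._isfilestore()` / `fs.exists(path)` check is copied from the traceback.

```python
from urllib.parse import urlparse


class LocalStore:
    """Hypothetical stand-in for a filesystem wrapper (not a pyarrow class)."""

    def __init__(self):
        self.created = []

    def _isfilestore(self):
        return True

    def exists(self, path):
        return path in self.created

    def mkdir(self, path):
        self.created.append(path)


def resolve_fs(path):
    """Hypothetical resolver: knows local paths only, returns None otherwise."""
    scheme = urlparse(path).scheme
    return LocalStore() if scheme in ("", "file") else None


def mkdir_if_not_exists(fs, path):
    # Same shape as the failing line in the traceback: breaks when fs is None.
    if fs._isfilestore() and not fs.exists(path):
        fs.mkdir(path)


def mkdir_checked(fs, path):
    # Guarded variant: fail early with a message that names the real problem.
    if fs is None:
        raise ValueError("no filesystem resolved for %r; pass one explicitly" % path)
    mkdir_if_not_exists(fs, path)


def try_mkdir(path):
    """Return 'ok' or the AttributeError message for the unguarded call."""
    try:
        mkdir_if_not_exists(resolve_fs(path), path)
    except AttributeError as exc:
        return str(exc)
    return "ok"


if __name__ == "__main__":
    print(try_mkdir("/tmp/x2.parquet"))
    print(try_mkdir("s3://my_s3_bucket/x2.parquet"))
```

In this sketch the local path succeeds and the `s3://` path reproduces the same kind of `AttributeError` as above, while `mkdir_checked` turns it into an explicit error about the missing filesystem.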
      

       

      Original issue - https://github.com/apache/arrow/issues/4030

          People

            Assignee: Unassigned
            Reporter: Victor Shih (vshih)
            Votes: 1
            Watchers: 8
