Apache Arrow / ARROW-1830

[Python] Error when loading all the files in a directory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.1
    • Fix Version/s: 0.8.0
    • Component/s: Python
    • Environment:
      Python 2.7.11 (default, Jan 22 2016, 08:29:18) + pyarrow 0.7.1

      Description

      I can read a single Parquet file, but when I try to read all the Parquet files in a folder, I get an error.

      >>> import pyarrow.parquet as pq
      >>> data = pq.ParquetDataset('./aaa/part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86')
      >>> data = pq.ParquetDataset('./aaa/')
      Ignoring path: ./aaa//part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 638, in __init__
          self.validate_schemas()
        File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 647, in validate_schemas
          self.schema = self.pieces[0].get_metadata(open_file).schema
      IndexError: list index out of range
      >>> 
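
The "Ignoring path" message followed by the IndexError on self.pieces[0] suggests that the only file in the folder was skipped during directory discovery, leaving the dataset with no pieces. One possible workaround, sketched below under that assumption, is to collect the file paths yourself and hand them over as an explicit list. The helper list_data_files is hypothetical (it is not part of pyarrow), and the rule of skipping names that start with "_" or "." is an assumption about which files are metadata or markers rather than data.

```python
import os


def list_data_files(directory):
    """Return sorted paths of the regular files directly inside `directory`.

    Hypothetical helper, not part of pyarrow. Names starting with '_' or '.'
    are skipped on the assumption that they are metadata or marker files
    (for example `_SUCCESS`) rather than data files.
    """
    paths = []
    for name in sorted(os.listdir(directory)):
        if name.startswith(("_", ".")):
            continue
        full_path = os.path.join(directory, name)
        # Subdirectories are ignored; only plain files are returned.
        if os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

Since ParquetDataset accepts a list of paths as well as a single path, the result could then be passed as pq.ParquetDataset(list_data_files('./aaa/')). This has not been verified against pyarrow 0.7.1, so treat it as a starting point rather than a confirmed fix.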
      

              People

              • Assignee: Wes McKinney (wesmckinn)
              • Reporter: DB Tsai (dbtsai)
              • Votes: 0
              • Watchers: 4
