Apache Arrow / ARROW-6713

[Python] Getting "ArrowIOError: Corrupted file, smaller than file footer" when reading large number of parquet files to ParquetDataset()


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem

    Description

      When I try to read a large number of Parquet files (more than 600) into ParquetDataset(), I get the following error:

      ArrowIOError: Corrupted file, smaller than file footer.

       

      This could be related to this issue: https://issues.apache.org/jira/browse/ARROW-3424

      Note:

      - This works fine for a small number of Parquet files (fewer than 245, to be exact; not sure if this helps).
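One way to narrow this down is to check whether any individual file in the set is too short to be a Parquet file at all, since the error message points at a file smaller than its footer. The sketch below is a diagnostic aid only, not what Arrow does internally, and the cause in this report is not confirmed. It assumes unencrypted Parquet files, which start and end with the 4-byte magic "PAR1" and store a 4-byte footer length just before the trailing magic. The helper name find_suspect_parquet_files is hypothetical.

```python
import os

# Unencrypted Parquet files begin and end with this 4-byte magic.
PARQUET_MAGIC = b"PAR1"
# Leading magic (4) + footer length (4) + trailing magic (4).
MIN_PARQUET_SIZE = 12


def find_suspect_parquet_files(paths):
    """Return (path, reason) pairs for files that cannot be valid Parquet.

    Hypothetical helper: it only checks the size and magic bytes,
    it does not parse the footer metadata.
    """
    suspects = []
    for path in paths:
        size = os.path.getsize(path)
        if size < MIN_PARQUET_SIZE:
            suspects.append((path, f"too small ({size} bytes)"))
            continue
        with open(path, "rb") as f:
            head = f.read(4)
            f.seek(-4, os.SEEK_END)
            tail = f.read(4)
        if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
            suspects.append((path, "missing PAR1 magic bytes"))
    return suspects
```

Passing the same list of paths that is handed to ParquetDataset() should reveal empty or truncated files, as well as non-Parquet files (markers, checksums, and similar) that ended up in the list.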

       

       


          People

            Assignee: Unassigned
            Reporter: Harini Kannan (harinikannan)
            Votes: 0
            Watchers: 3
