Apache Arrow / ARROW-4311

[Python] Regression on pq.ParquetWriter incorrectly handling source string

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.13.0
    • Component/s: Python
    • Labels: None

    Description

      In the latest changes to filesystem.py, some new functions were added that check the source string passed to pq.ParquetWriter. The current implementation makes assumptions about the format of that string, so a string that matches certain patterns is automatically split or reformatted and ends up as a different path.

      As a specific example, if I provide a string like directory/level1#level2.parquet, the file is written to disk as directory/level1. The behaviour changed between 0.11.1 and 0.12.0, and the change is not mentioned in the documentation.
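
      The truncation in the example above is consistent with the source string being parsed as a URI, where everything after '#' is treated as a fragment. This is an assumption, since the report does not say how filesystem.py inspects the string. The sketch below uses only the standard library's urllib.parse.urlparse, not pyarrow itself, and the helper name split_like_uri is hypothetical.

```python
from urllib.parse import urlparse


def split_like_uri(source):
    """Hypothetical helper: parse a source string as a URI.

    Returns the (path, fragment) pair that URI parsing yields.
    This is NOT pyarrow code; it only illustrates the assumed mechanism.
    """
    parsed = urlparse(source)
    return parsed.path, parsed.fragment


# The path from the report: the part after '#' becomes the URI fragment,
# so only 'directory/level1' is left as the path.
path, fragment = split_like_uri("directory/level1#level2.parquet")
print(path)      # directory/level1
print(fragment)  # level2.parquet
```

      If this assumption holds, a path without '#' passes through unchanged, which would explain why only some file names are affected.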

            People

              Assignee: Antoine Pitrou (apitrou)
              Reporter: Francisco Sanchez (FJ_Sanchez)
              Votes: 1
              Watchers: 6
