  1. Apache Arrow
  2. ARROW-13972

[Python] Read CSV with a varying number of columns per row


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.0.0
    • Fix Version/s: None
    • Component/s: Python

    Description

      When reading CSV data in which rows have differing numbers of columns, pyarrow fails with an error message like the one below. Other libraries, such as Spark and pandas, fill the missing columns with null values instead. Would it be possible to add such a feature to pyarrow? The CSV may or may not contain a header row.

      Expected 952 columns, got 620:
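
      Until such a feature exists, one possible workaround is to pad short rows before building a table. The following is a minimal sketch using only the Python standard library, not a pyarrow API: the helper name `read_ragged_csv`, its `has_header` parameter, and the `f0`, `f1`, ... fallback column names are all hypothetical, and it assumes the data fits in memory and that header names are unique.

```python
import csv
import io


def read_ragged_csv(text, has_header=True):
    """Parse CSV text whose rows vary in width, padding short rows with None.

    Hypothetical helper for illustration only; it is not part of pyarrow.
    Returns a dict mapping column name -> list of values (str or None).
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {}

    # The widest row determines how many columns the result has.
    width = max(len(row) for row in rows)

    if has_header:
        header, data = rows[0], rows[1:]
    else:
        header, data = [], rows

    # Invent names (assumption: f0, f1, ...) for columns the header lacks.
    names = list(header) + [f"f{i}" for i in range(len(header), width)]

    columns = {name: [] for name in names}
    for row in data:
        # Fill the missing trailing fields with None (null).
        padded = row + [None] * (width - len(row))
        for name, value in zip(names, padded):
            columns[name].append(value)
    return columns


sample = "a,b,c\n1,2,3\n4,5\n6\n"
columns = read_ragged_csv(sample)
```

      All values remain strings (or None), so type inference is left to the caller. If pyarrow is installed, the resulting dict of lists should be usable as input to `pyarrow.table(columns)`, though that step is not exercised here.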

          People

            Assignee: Unassigned
            Reporter: Sai Krishna Chaitanya Chaganti
