Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
Description
from pyarrow import json
import pyarrow.parquet as pq

r = json.read_json('example.jl')
pq.write_table(r, 'example.parquet')
Running the above raises ArrowInvalid: Nested column branch had multiple children
Posting it here as requested in https://github.com/apache/arrow/issues/4045#issuecomment-535867640
The sample schema looks like this:
package_version: string
source_version: string
uuid: string
_type: string
position: struct<ais_type: string, course: double, draught: double, draught_raw: null, heading: double, lat: double, lon: double, nav_state: int64, received_time: timestamp[s], speed: double>
  child 0, ais_type: string
  child 1, course: double
  child 2, draught: double
  child 3, draught_raw: null
  child 4, heading: double
  child 5, lat: double
  child 6, lon: double
  child 7, nav_state: int64
  child 8, received_time: timestamp[s]
  child 9, speed: double
provider_name: string
vessel: struct<beam: null, build_year: null, call_sign: string, dead_weight: null, dwt: null, flag_code: null, flag_name: string, gross_tonnage: null, imo: string, length: null, mmsi: string, name: string, type: null, vessel_type: string>
  child 0, beam: null
  child 1, build_year: null
  child 2, call_sign: string
  child 3, dead_weight: null
  child 4, dwt: null
  child 5, flag_code: null
  child 6, flag_name: string
  child 7, gross_tonnage: null
  child 8, imo: string
  child 9, length: null
  child 10, mmsi: string
  child 11, name: string
  child 12, type: null
  child 13, vessel_type: string
source_provider: string
Attachments
Issue Links
- is duplicated by ARROW-1644 [C++][Parquet] Read and write nested Parquet data with a mix of struct and list nesting levels (Resolved)