[ARROW-14596] [Python] parquet.read_table nested fields in columns does not work for use_legacy_dataset=False

Details

Type: Bug
Status: In Progress
Priority: Critical
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: 11.0.0
Component/s: Python
Labels:
- pull-request-available

External issue URL:
https://github.com/apache/arrow/issues/30143

Description

Selecting a nested field through the columns argument of parquet.read_table does not work with use_legacy_dataset=False.

This works:

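import pyarrow.parquet as pq

# filename is the path to the Parquet file (not specified in the report).
t = pq.read_table(
    source=filename,
    columns=['store_key', 'properties.country'],  # dotted path selects a struct child
    use_legacy_dataset=True,
).to_pandas()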

This does not work (for the same parquet file):

import pyarrow.parquet as pq

# Same file and same column selection; only the reader implementation changes.
t = pq.read_table(
    source=filename,
    columns=['store_key', 'properties.country'],
    use_legacy_dataset=False,
).to_pandas()

Issue Links

is blocked by: ARROW-11259 [Python] Allow to create field reference to nested field (Resolved)

is related to: ARROW-17540 [Python] Can not refer to field in a list of structs (Open)

links to: GitHub Pull Request #14326

People

Assignee: Miles Granger

Reporter: Tom Scheffers

Votes: 0

Watchers: 5

Dates

Created: 04/Nov/21 20:35

Updated: 11/Jan/23 08:41

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 3h 10m