[ARROW-7080] [Python][Parquet][C++] Expose parquet field_id in Schema objects - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.17.0
Component/s: C++, Python
Labels:
- parquet
- pull-request-available

External issue URL:
https://github.com/apache/arrow/issues/23388

Description

I'm in the process of adding Parquet read support to Iceberg (https://iceberg.apache.org/). We use the Parquet field_ids as stable identifiers when reading a Parquet file, to build a map between the current schema and the schema of the file being read. Unless I've missed something, field_id is not exposed in the Python APIs: it is available neither in pyarrow._parquet.ParquetSchema nor in pyarrow.lib.Schema.

Would it be possible to add this to either of those two objects?
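
To illustrate the use case, here is a minimal sketch of the field_id-based mapping described above. It does not use pyarrow: plain dicts of field_id to column name stand in for the two schemas, and the function name `map_columns_by_field_id` and the sample column names are hypothetical, chosen only for this example.

```python
from typing import Dict, Optional


def map_columns_by_field_id(
    current: Dict[int, str], file_schema: Dict[int, str]
) -> Dict[str, Optional[str]]:
    """Map each column of the current schema to the file column sharing its field_id.

    Both arguments are {field_id: column_name} dicts (an assumption made for
    this sketch, not a pyarrow structure). Columns missing from the file map
    to None.
    """
    return {name: file_schema.get(field_id) for field_id, name in current.items()}


# Hypothetical schemas: column 2 was renamed and column 3 was added
# after the file was written.
current_schema = {1: "id", 2: "customer_name", 3: "created_at"}
file_schema = {1: "id", 2: "name"}

column_map = map_columns_by_field_id(current_schema, file_schema)
print(column_map)
```

Because the lookup is keyed on field_id rather than on name, the renamed column still resolves to its data in the older file, which is why the ids need to be reachable from the schema objects.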

Issue Links

links to

GitHub Pull Request #6408

People

Assignee: Wes McKinney

Reporter: Ted Gooch

Votes: 0

Watchers: 3

Dates

Created: 06/Nov/19 18:50

Updated: 11/Jan/23 07:51

Resolved: 21/Feb/20 08:19

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 8h 40m