Apache Arrow / ARROW-7080

[Python][Parquet][C++] Expose parquet field_id in Schema objects

Details

    Description

      I'm in the process of adding Parquet read support to Iceberg (https://iceberg.apache.org/). We use the Parquet field_ids as stable identifiers when reading a Parquet file, so that we can build a map between the current schema and the schema of the file being read. Unless I've missed something, field_id is not exposed in the Python APIs: it is available neither in pyarrow._parquet.ParquetSchema nor in pyarrow.lib.Schema.

      Would it be possible to add this to either of those two objects?
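For illustration, the kind of id-based mapping described above could look roughly like the sketch below. It is plain Python and does not call any pyarrow API; the function name `map_by_field_id`, the dict-based schema representation, and the column names are all hypothetical and are not taken from Iceberg or Arrow.

```python
from typing import Dict, Optional


def map_by_field_id(
    current: Dict[int, str], file_schema: Dict[int, str]
) -> Dict[str, Optional[str]]:
    """Map each column name in the current schema to the column name that
    carries the same field id in the file schema (None if the file lacks it)."""
    return {name: file_schema.get(field_id) for field_id, name in current.items()}


# Hypothetical schemas, keyed by field id. The column with id 2 was renamed
# after the file was written, and the column with id 3 was added later.
current_schema = {1: "id", 2: "customer_name", 3: "created_at"}
file_schema = {1: "id", 2: "name"}

mapping = map_by_field_id(current_schema, file_schema)
print(mapping)
```

Because the lookup is keyed on field id rather than on column name, a renamed column still resolves to the right column in the file, which is why the ids need to be reachable from the Python schema objects.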

            People

              wesm Wes McKinney
              TGooch44 Ted Gooch
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 8h 40m