Apache Arrow / ARROW-8231

[Rust] Parse key_value_metadata from parquet FileMetaData into arrow schema metadata

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.0
    • Component/s: Rust

    Description

      The parquet-format `FileMetaData` struct contains an optional list of key-value pairs (`key_value_metadata`) with additional metadata about the schema:

      https://docs.rs/parquet-format/2.6.0/src/parquet_format/parquet_format.rs.html#3821

      For example, when the Parquet file was generated using the Java Avro Parquet writer, this metadata contains the original Avro schema under the `parquet.avro.schema` or `avro.schema` key.

      It would be nice if this metadata was accessible through the `arrow::datatypes::Schema.metadata` field.

      I'm willing to implement and create a pull request for this feature.
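
The conversion requested above can be sketched roughly as follows. This is only an illustration, not the change that resolved the issue: the `KeyValue` struct is a local stand-in that mirrors the shape of the parquet-format type (a required key and an optional value), the function name `key_value_metadata_to_map` is hypothetical, and skipping entries without a value is an assumption.

```rust
use std::collections::HashMap;

/// Local stand-in (assumption) for the parquet-format `KeyValue` struct:
/// a required key with an optional value.
#[derive(Debug, Clone)]
pub struct KeyValue {
    pub key: String,
    pub value: Option<String>,
}

/// Hypothetical helper: flattens the optional key-value list from the file
/// metadata into a string map of the kind a schema's metadata could hold.
/// Entries without a value are skipped (an assumption of this sketch);
/// if a key occurs more than once, the last occurrence wins.
pub fn key_value_metadata_to_map(
    key_value_metadata: Option<&[KeyValue]>,
) -> HashMap<String, String> {
    key_value_metadata
        .unwrap_or(&[])
        .iter()
        .filter_map(|kv| {
            kv.value
                .as_ref()
                .map(|value| (kv.key.clone(), value.clone()))
        })
        .collect()
}
```

The resulting map could then be handed to whatever constructs the arrow schema, so that a reader can look up `parquet.avro.schema` in the schema metadata. A file without any key-value metadata simply yields an empty map.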

            People

              Assignee: Jörn Horstmann (jhorstmann)
              Reporter: Jörn Horstmann (jhorstmann)
              Votes: 2
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 20m