PARQUET-1685

Truncate the stored min and max for String statistics to reduce the footer size


Details

    Description

      Iceberg truncates the stored min and max statistics to keep its metadata small. Parquet could borrow this approach and truncate the min and max values of String columns to reduce the size of the footer, and possibly of the page headers as well. The Iceberg implementation is in UnicodeUtil.java: https://github.com/apache/incubator-iceberg/blob/master/api/src/main/java/org/apache/iceberg/util/UnicodeUtil.java
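To make the idea concrete, here is a minimal Java sketch of how such truncation could work. It is not the Iceberg or Parquet code: the class name `StatTruncation`, the method names, and the `length` parameter (a limit in Unicode code points) are hypothetical. It assumes values are compared in code point order, which matches unsigned byte-wise comparison of their UTF-8 encodings, and that inputs contain no unpaired surrogates. A truncated min is just a prefix, which always sorts at or below the original. A truncated max has to be rounded up so that it still sorts at or above the original.

```java
final class StatTruncation {

    // Lower bound: a prefix never sorts after the full value,
    // so plain truncation keeps the min valid.
    static String truncateMin(String value, int length) {
        if (value.codePointCount(0, value.length()) <= length) {
            return value; // already short enough
        }
        return value.substring(0, value.offsetByCodePoints(0, length));
    }

    // Upper bound: truncate, then increment the last code point that can
    // be incremented and drop everything after it. Returns null when no
    // shorter upper bound exists; the caller would keep the original max.
    static String truncateMax(String value, int length) {
        if (value.codePointCount(0, value.length()) <= length) {
            return value; // already short enough
        }
        int[] cps = value.codePoints().limit(length).toArray();
        for (int i = cps.length - 1; i >= 0; i--) {
            int next = cps[i] + 1;
            // Surrogates are not valid scalar values; jump past them.
            if (next >= Character.MIN_SURROGATE && next <= Character.MAX_SURROGATE) {
                next = Character.MAX_SURROGATE + 1;
            }
            if (next <= Character.MAX_CODE_POINT) {
                cps[i] = next;
                return new String(cps, 0, i + 1);
            }
            // This code point is already the maximum; try the one before it.
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(truncateMin("abcdef", 3)); // prints "abc"
        System.out.println(truncateMax("abcdef", 3)); // prints "abd"
    }
}
```

With a limit of 3, a column whose true range is "abcdef" to "abcdef" would be recorded as "abc" to "abd". The stored range is wider than the real one, so filters may skip fewer row groups or pages, but they never skip data that actually matches.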


          People

            shangx@uber.com Xinli Shang
