G1GC allocates humongous objects directly in the old generation to avoid unnecessary copies, which means that these allocations may not be reclaimed until a concurrent marking cycle finishes or, in the worst case, a full GC runs. Humongous objects are objects that are 50% of the region size or more. Region size is at most 32MB (see the table for region size from heap size).
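
To make the threshold concrete, here is a rough sketch that estimates the region size G1 would pick for a given heap and the resulting humongous threshold. It assumes the commonly documented heuristic (aim for about 2048 regions, round down to a power of two, clamp to the 1MB to 32MB range) and takes a single heap size as a simplification. The class and method names are hypothetical, and an explicit -XX:G1HeapRegionSize setting overrides the estimate.

```java
public class G1HumongousThreshold {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION_BYTES = 1 * MB;   // smallest region size assumed here
    static final long MAX_REGION_BYTES = 32 * MB;  // upper bound stated in the text above
    static final long TARGET_REGION_COUNT = 2048;  // assumed heuristic target

    // Estimate the region size for a heap: heap / 2048, rounded down to a
    // power of two, then clamped to [1MB, 32MB].
    static long estimateRegionSize(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION_BYTES);
        long powerOfTwo = Long.highestOneBit(raw);
        return Math.min(powerOfTwo, MAX_REGION_BYTES);
    }

    // Allocations of about this size or larger are treated as humongous.
    static long humongousThreshold(long regionBytes) {
        return regionBytes / 2;
    }

    public static void main(String[] args) {
        long[] heapsInGb = {1, 4, 8, 64, 128};
        for (long gb : heapsInGb) {
            long region = estimateRegionSize(gb * 1024 * MB);
            System.out.println(gb + "GB heap: region " + (region / MB)
                + "MB, humongous threshold " + (humongousThreshold(region) / 1024) + "KB");
        }
    }
}
```

Under these assumptions an 8GB heap gets 4MB regions, so any single buffer of about 2MB or more would be allocated as a humongous object.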
Parquet currently allocates one large buffer for each contiguous group of column chunks, which in many cases is not garbage collected until a full GC. Adding a limit on the allocation size should allow users to break row groups across multiple buffers so that each buffer can be collected once it has been read.
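
As an illustration of the proposed behavior, the sketch below splits one logical read into several buffers, none larger than a configured cap. It is not Parquet's actual reader code: `CappedAllocation`, `allocateCapped`, and `maxAllocBytes` are hypothetical names, and a real implementation would also need to fill the buffers from the input stream and hand them to the column chunk readers.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class CappedAllocation {
    // Allocate enough buffers to hold totalBytes, with no single buffer
    // larger than maxAllocBytes (hypothetical parameter for the proposed limit).
    static List<ByteBuffer> allocateCapped(int totalBytes, int maxAllocBytes) {
        if (totalBytes < 0 || maxAllocBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and maxAllocBytes > 0");
        }
        List<ByteBuffer> buffers = new ArrayList<>();
        int remaining = totalBytes;
        while (remaining > 0) {
            int size = Math.min(remaining, maxAllocBytes);
            buffers.add(ByteBuffer.allocate(size));
            remaining -= size;
        }
        return buffers;
    }

    public static void main(String[] args) {
        // A 20MB contiguous read with an 8MB cap yields buffers of 8MB, 8MB and 4MB.
        int mb = 1024 * 1024;
        for (ByteBuffer buffer : allocateCapped(20 * mb, 8 * mb)) {
            System.out.println(buffer.capacity() / mb + "MB");
        }
    }
}
```

If the cap is set below the humongous threshold for the heap in use, each buffer becomes an ordinary allocation that can be collected once the reader has dropped its reference to it.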