  1. Parquet
  2. PARQUET-2180

Make the default behavior for proto writing non-backwards-compatible

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: parquet-protobuf
    • Labels: None

    Description

      https://issues.apache.org/jira/browse/PARQUET-968 introduced support for writing maps and lists in a spec-compliant way.  However, to avoid breaking existing libraries, it also introduced a flag, and that flag defaults the write behavior to NOT use the spec-compliant layout.
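
To make the difference concrete, here is a rough sketch of the two schema shapes for a proto field like `repeated int32 values`. It only builds the textual schema. The class and method names are hypothetical, and the `required` repetitions on the outer group and on `element` are assumptions for a proto repeated field; the 3-level `LIST` structure itself follows the Parquet format spec.

```java
// Illustrative only: prints the Parquet schema text for a repeated primitive
// in the old style and in the spec-compliant style. Not parquet-protobuf code.
class ListLayoutSketch {

    // Old (backwards-compatible) style: the primitive itself is marked repeated.
    static String legacyLayout(String name, String primitive) {
        return "repeated " + primitive + " " + name + ";";
    }

    // Spec-compliant style: outer group annotated LIST, a repeated group named
    // "list", and a field named "element". The outer repetition is a parameter
    // because its exact value here is an assumption.
    static String specCompliantLayout(String outerRepetition, String name, String primitive) {
        return String.join("\n",
            outerRepetition + " group " + name + " (LIST) {",
            "  repeated group list {",
            "    required " + primitive + " element;", // assumed: elements are never null
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(legacyLayout("values", "int32"));
        System.out.println(specCompliantLayout("required", "values", "int32"));
    }
}
```

A reader that only accepts the second shape will not interpret the first one as a list, which is the kind of mismatch described in this issue.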

      It's been over 5 years, and people really should be off the old style by now.  So much so that using the new parquet-cli tool to read parquet files generated by flink doesn't work, b/c it's hard-coded to never allow the old style.  The deprecated parquet-tools reads these files fine b/c it still expects the older style.

      I started coding up a workaround in flink-parquet and parquet-cli, but stopped.  We really should just move on at this point, imho.  Protobufs often have repeated primitives and maps, so getting proper spec-compliant support for them is more pressing now.  We should keep the flag around, though, and let people override it back to being backwards compatible.
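
A minimal sketch of what the proposed default flip could look like. This is not the actual patch: in parquet-protobuf the flag lives on the Hadoop `Configuration`, and a plain `Map` stands in for it here so the example runs without dependencies. The key string is assumed to match the one used by `ProtoWriteSupport` and should be checked against the source; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: resolves the flag with the proposed new default.
class SpecsCompliantDefaultSketch {

    // Assumption: same key as parquet-protobuf's ProtoWriteSupport uses.
    static final String KEY = "parquet.proto.writeSpecsCompliant";

    // Proposed behavior: spec-compliant writes unless the user explicitly
    // sets the flag to false to get the old backwards-compatible layout.
    static boolean writeSpecsCompliant(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(KEY, "true"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(writeSpecsCompliant(conf)); // flag unset: new default applies

        conf.put(KEY, "false"); // explicit opt-out back to the old style
        System.out.println(writeSpecsCompliant(conf));
    }
}
```

Existing jobs that set the flag explicitly would behave as before; only jobs relying on the default would change.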

      I have the code written and can submit a PR if you'd like.

      I'm not an expert in parquet, though, so I'm unclear on the deeper downstream ramifications of this change and would love to get feedback in this area.

          People

            Assignee: Unassigned
            Reporter: J Y (jinyius)
            Votes: 0
            Watchers: 1
