Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
I personally believe Parquet will be at the center of the analytics ecosystem.
https://parquet.apache.org/ currently emphasizes Parquet's role in the Hadoop ecosystem. I think this causes confusion in several ways:
1. It implies that Parquet is focused only on Hadoop, when I think it is a critical technology across other ecosystems that are unrelated to Hadoop (e.g. Apache Iceberg, Delta Lake, etc.).
2. It may further the perception that the Apache Parquet project only focuses on / cares about the Hadoop / Java implementation.
I would like to update the site to focus less on the Hadoop aspects and more on the broader nature of Parquet.
If people like where this is headed, I would next like to expand the documentation to better explain how the various implementations are related (e.g. how parquet-mr relates to the readers in arrow-rs, arrow, etc.).