  1. Parquet
  2. PARQUET-2470

Update the website to describe the larger role of Parquet

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • None
    • parquet-site

    Description

      I personally believe Parquet will be at the center of the analytics ecosystem.

      https://parquet.apache.org/ currently emphasizes Parquet's role in the Hadoop ecosystem. I think this causes confusion in several ways:

      1. It implies that Parquet is focused only on Hadoop, when I think it is a critical technology across other ecosystems that are unrelated to Hadoop (e.g. Apache Iceberg, Delta Lake, etc.)
      2. It may further the perception that the Apache Parquet project only focuses on / cares about the Hadoop / Java implementation

       

      I would like to update the site to focus less on the Hadoop aspects and more on the broader nature of Parquet.

       

      If people like where this is headed, I would next like to expand the documentation to better explain how the various implementations are related (e.g. how parquet-mr relates to the readers in arrow-rs, arrow, etc.).

            People

              Assignee: Andrew Lamb (alamb)
              Reporter: Andrew Lamb (alamb)
              Votes: 0
              Watchers: 2
