Parquet / PARQUET-1465

CLONE - Add a way to append encoded blocks in ParquetFileWriter


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.0
    • Fix Version/s: 1.9.0, 1.8.2
    • Component/s: parquet-mr
    • Labels: None

    Description

      Concatenating two files together currently requires reading the source files and rewriting the content from scratch. This ends up taking a lot of memory, even if the data is already encoded correctly and blocks just need to be appended and have their metadata updated. Merging two files should be fast and not take much memory.
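
The idea described above (copy already-encoded blocks as raw bytes and only rewrite their metadata) can be sketched with a toy in-memory format. This is an illustration only, not the parquet-mr implementation and not its API: `BlockAppendSketch`, `BlockMeta`, `EncodedFile`, and `concat` are hypothetical names, and the "footer" here is just a list of offsets, lengths, and row counts. A real writer would stream bytes between files rather than hold them in arrays.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockAppendSketch {

    /** Hypothetical stand-in for block metadata: where an encoded block sits in its file. */
    static final class BlockMeta {
        final long offset;
        final int length;
        final long rowCount;

        BlockMeta(long offset, int length, long rowCount) {
            this.offset = offset;
            this.length = length;
            this.rowCount = rowCount;
        }
    }

    /** Hypothetical toy file: raw encoded bytes plus a footer listing its blocks. */
    static final class EncodedFile {
        final byte[] data;
        final List<BlockMeta> footer;

        EncodedFile(byte[] data, List<BlockMeta> footer) {
            this.data = data;
            this.footer = footer;
        }
    }

    /**
     * Concatenates files by copying each block's bytes unchanged and recording
     * a new offset for it. No block is decoded or re-encoded.
     */
    static EncodedFile concat(List<EncodedFile> sources) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<BlockMeta> footer = new ArrayList<>();
        for (EncodedFile src : sources) {
            for (BlockMeta block : src.footer) {
                long newOffset = out.size(); // position of this block in the merged output
                out.write(src.data, (int) block.offset, block.length);
                footer.add(new BlockMeta(newOffset, block.length, block.rowCount));
            }
        }
        return new EncodedFile(out.toByteArray(), footer);
    }

    public static void main(String[] args) {
        EncodedFile a = new EncodedFile(
                new byte[] {1, 2, 3, 4, 5},
                Arrays.asList(new BlockMeta(0, 2, 10), new BlockMeta(2, 3, 20)));
        EncodedFile b = new EncodedFile(
                new byte[] {9, 8, 7},
                Arrays.asList(new BlockMeta(0, 3, 5)));

        EncodedFile merged = concat(Arrays.asList(a, b));
        for (BlockMeta m : merged.footer) {
            System.out.println("offset=" + m.offset + " length=" + m.length + " rows=" + m.rowCount);
        }
    }
}
```

The only per-block work is a byte copy and an offset update, which is why this kind of merge can avoid the memory cost of rewriting the content from scratch.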


            People

              Assignee: Ryan Blue (rdblue)
              Reporter: Steven Paster (spaster)
              Votes: 0
              Watchers: 2
