Parquet / PARQUET-17

Parquet OutputFormat should allow controlling the file size

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      To generate the most efficient on-disk files, it is important to control the file size. It would be nice if we could configure the OutputFormat to roll over to a new file once the current file reaches the target size.

      There is currently no easy way to tune this; it requires indirect tuning (number of reducers, map input size).
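
      The requested rollover behaviour can be sketched roughly as follows. This is a generic illustration in Python, not the Parquet OutputFormat API: `RollingWriter`, `target_bytes`, the `part-NNNNN` naming, and `write_records` are all hypothetical names chosen for the example. It assumes records are raw byte strings and that counting bytes written is a good enough measure of file size; a real Parquet writer would presumably need to account for buffering and encoding as well.

```python
import os
import tempfile


class RollingWriter:
    """Write records to numbered files, starting a new file once the
    current one has reached `target_bytes` (hypothetical parameter)."""

    def __init__(self, directory, target_bytes, prefix="part"):
        self.directory = directory
        self.target_bytes = target_bytes
        self.prefix = prefix
        self.paths = []      # every file created so far, in order
        self._file = None
        self._written = 0    # bytes written to the current file

    def _roll(self):
        # Close the current file (if any) and open the next numbered one.
        if self._file is not None:
            self._file.close()
        name = f"{self.prefix}-{len(self.paths):05d}"
        path = os.path.join(self.directory, name)
        self._file = open(path, "wb")
        self.paths.append(path)
        self._written = 0

    def write(self, record: bytes):
        # Roll before writing when there is no open file yet, or when the
        # current file has already reached the target size.
        if self._file is None or self._written >= self.target_bytes:
            self._roll()
        self._file.write(record)
        self._written += len(record)

    def close(self):
        if self._file is not None:
            self._file.close()
            self._file = None


def write_records(records, target_bytes):
    """Write all records to a fresh temp directory; return the file sizes."""
    directory = tempfile.mkdtemp()
    writer = RollingWriter(directory, target_bytes)
    for record in records:
        writer.write(record)
    writer.close()
    return [os.path.getsize(path) for path in writer.paths]
```

      Because the size check happens before each write, a file can overshoot the target by up to one record. Checking after the write instead would not change that bound; avoiding the overshoot would require knowing each record's size in advance.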

          People

            Assignee: Unassigned
            Reporter: Nong Li (nongli)
            Votes: 0
            Watchers: 2
