IMPALA-6680 (sub-task of IMPALA-4835: HDFS scans should operate with a constrained number of I/O buffers)

Consider reserving max 2 I/O buffers per Parquet column when there are multiple columns

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Later
    • Not Applicable
    • None
    • Component: Backend

    Description

      Reserving 3 * 8MB per column for triple buffering adds little benefit except when scanning a single very large column. 2 * 8MB per column is enough to overlap compute and I/O, and with multiple large columns there will be multiple I/Os in flight in almost all cases anyway.
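
The arithmetic behind this proposal can be sketched as follows. This is only an illustration, not Impala's scanner code: the function names and the "3 buffers for a lone column, 2 otherwise" policy are assumptions taken from the issue title and description, and the 8MB buffer size is the figure quoted above.

```python
MB = 1024 * 1024
IO_BUFFER_BYTES = 8 * MB  # buffer size quoted in the description


def buffers_per_column(num_columns: int) -> int:
    """Hypothetical policy: triple-buffer a lone column, double-buffer otherwise."""
    if num_columns < 1:
        raise ValueError("num_columns must be at least 1")
    return 3 if num_columns == 1 else 2


def scan_reservation_bytes(num_columns: int, buffers: int) -> int:
    """Total I/O buffer reservation for a scan with a fixed buffer count per column."""
    return num_columns * buffers * IO_BUFFER_BYTES


# Compare always reserving 3 buffers per column with the proposed cap.
for n in (1, 4, 16):
    always_three = scan_reservation_bytes(n, 3)
    proposed = scan_reservation_bytes(n, buffers_per_column(n))
    print(f"{n:>2} columns: {always_three // MB} MB -> {proposed // MB} MB")
```

Under these assumptions a 16-column scan would reserve 256 MB instead of 384 MB, while a single-column scan keeps its three buffers.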

          People

            Assignee: Tim Armstrong (tarmstrong)
            Reporter: Tim Armstrong (tarmstrong)
            Votes: 0
            Watchers: 1
