IMPALA / IMPALA-6673

Be smarter about I/O patterns for Parquet scan ranges

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Backend

    Description

      Currently the Parquet scanner is somewhat naive about how it issues column scan ranges: it issues a separate scan range per column, in the order that the column readers are organised internally. If the column ranges are large (i.e. they span multiple I/O buffers) or we're reading from SSDs, where random access is fairly efficient, this may not matter very much. However, this approach is suboptimal when reading smaller columns (e.g. highly compressed ones) from spinning disks, for two reasons:

      1. Some columns may be adjacent in the file. If we are reading each column into a single smaller I/O buffer but multiple columns would fit in a larger I/O buffer, we would probably be better off doing a single I/O that covers those adjacent columns.
      2. We are reading the columns in a fairly random order, because the I/O manager does round robin on the scan ranges in the order they were added. Sorting the scan ranges by file offset would improve the odds of being able to read each subsequent column without an additional seek and would also improve locality for the disk's internal cache. Based on some superficial googling, a lot of drives have 64MB or 128MB internal caches, which is large enough to be useful but small enough that, if we do I/O from a 256MB+ Parquet file in a completely random order, we significantly reduce the chances of getting cache hits.
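
      The two points above could be combined roughly as follows. This is only a sketch under stated assumptions, not Impala's actual I/O manager code: ColumnRange, CoalescedRead and PlanReads are hypothetical names, and max_gap and max_io_bytes are illustrative tunables (the latter standing in for the size of a larger I/O buffer). It sorts the column ranges by file offset, then merges neighbours into one read when they are adjacent (or nearly so) and the merged read still fits in the buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one column's byte range within a Parquet file.
struct ColumnRange {
  int64_t offset;  // byte offset of the column data in the file
  int64_t len;     // length in bytes
};

// Hypothetical merged I/O request that serves one or more column ranges.
struct CoalescedRead {
  int64_t offset;
  int64_t len;
  std::vector<ColumnRange> columns;  // ranges covered by this read, in offset order
};

// Sorts the ranges by file offset, then merges a range into the previous read
// when the gap between them is at most max_gap bytes and the merged read
// does not exceed max_io_bytes. Both limits are assumed tunables.
std::vector<CoalescedRead> PlanReads(std::vector<ColumnRange> ranges,
                                     int64_t max_gap, int64_t max_io_bytes) {
  assert(max_gap >= 0 && max_io_bytes > 0);
  std::sort(ranges.begin(), ranges.end(),
            [](const ColumnRange& a, const ColumnRange& b) {
              return a.offset < b.offset;
            });

  std::vector<CoalescedRead> reads;
  for (const ColumnRange& r : ranges) {
    if (!reads.empty()) {
      CoalescedRead& last = reads.back();
      const int64_t last_end = last.offset + last.len;
      const int64_t new_end = std::max(last_end, r.offset + r.len);
      const bool close_enough = r.offset - last_end <= max_gap;
      const bool fits = new_end - last.offset <= max_io_bytes;
      if (close_enough && fits) {
        // Extend the previous read to cover this column as well.
        last.len = new_end - last.offset;
        last.columns.push_back(r);
        continue;
      }
    }
    // Start a new read at this column's offset.
    reads.push_back(CoalescedRead{r.offset, r.len, {r}});
  }
  return reads;
}
```

      For example, four columns at offsets 4096, 0, 1024 and 1MB, planned with max_gap = 1024 and max_io_bytes = 8192, would be issued as two reads in ascending offset order: one covering the first three columns and one for the distant fourth. A non-zero max_gap trades a few wasted bytes for one fewer seek, which is an assumption that would need measuring on real disks.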

      IMPALA-4835 may help a lot here, since it will tell us upfront what the memory budget is for I/O.

            People

              Assignee: Unassigned
              Reporter: Tim Armstrong (tarmstrong)
              Votes: 2
              Watchers: 6
