PARQUET-2171

Implement vectored IO in parquet file format

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component: parquet-mr

    Description

      We recently added a new feature to Hadoop, vectored IO, which improves read performance for seek-heavy readers. Spark jobs and other applications that use Parquet would benefit greatly from this API. Details can be found here:

      https://github.com/apache/hadoop/commit/e1842b2a749d79cbdc15c524515b9eda64c339d5

      https://issues.apache.org/jira/browse/HADOOP-18103

      https://issues.apache.org/jira/browse/HADOOP-11867
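
      To illustrate the idea behind a vectored read, here is a minimal, self-contained sketch using only the JDK. It is not the Hadoop or Parquet API: the class VectoredReadSketch, the Range type, and the readVectored method below are hypothetical names chosen for this example. The caller hands over all byte ranges it needs up front, and each range gets a future that completes when its bytes are available, instead of the caller issuing one seek and read at a time.

      ```java
      import java.io.EOFException;
      import java.io.IOException;
      import java.nio.ByteBuffer;
      import java.nio.channels.FileChannel;
      import java.nio.file.Path;
      import java.nio.file.StandardOpenOption;
      import java.util.ArrayList;
      import java.util.List;
      import java.util.concurrent.CompletableFuture;

      // Illustrative only: hypothetical names, not Hadoop's or Parquet's API.
      public class VectoredReadSketch {

          /** A byte range to fetch; `data` completes once the bytes are read. */
          public static final class Range {
              public final long offset;
              public final int length;
              public final CompletableFuture<ByteBuffer> data = new CompletableFuture<>();

              public Range(long offset, int length) {
                  this.offset = offset;
                  this.length = length;
              }
          }

          /** Starts an asynchronous positional read for every requested range. */
          public static void readVectored(Path file, List<Range> ranges) throws IOException {
              // Positional reads on a FileChannel do not move a shared cursor,
              // so several ranges can be read concurrently from one channel.
              FileChannel channel = FileChannel.open(file, StandardOpenOption.READ);
              List<CompletableFuture<Void>> tasks = new ArrayList<>();
              for (Range r : ranges) {
                  tasks.add(CompletableFuture.runAsync(() -> {
                      try {
                          ByteBuffer buf = ByteBuffer.allocate(r.length);
                          long pos = r.offset;
                          while (buf.hasRemaining()) {
                              int n = channel.read(buf, pos);
                              if (n < 0) {
                                  throw new EOFException("range extends past end of file");
                              }
                              pos += n;
                          }
                          buf.flip();
                          r.data.complete(buf);
                      } catch (IOException e) {
                          r.data.completeExceptionally(e);
                      }
                  }));
              }
              // Close the channel once every range has finished, successfully or not.
              CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0]))
                  .whenComplete((ignored, error) -> {
                      try {
                          channel.close();
                      } catch (IOException e) {
                          // Nothing useful to do in this sketch.
                      }
                  });
          }
      }
      ```

      A reader for a columnar file could request the ranges of all column chunks it needs in one call and then wait on each range's future as it decodes. A real implementation could also merge nearby ranges into one larger read, which this sketch does not attempt.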

            People

              Assignee: Unassigned
              Reporter: Mukund Thakur (mthakur)
              Votes: 1
              Watchers: 11
