Description
Here is the relevant code snippet from VectorizedParquetRecordReader.java:
MessageType tableSchema;
if (indexAccess) {
  List<Integer> indexSequence = new ArrayList<>();

  // Generates a sequence list of indexes
  for (int i = 0; i < columnNamesList.size(); i++) {
    indexSequence.add(i);
  }

  tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, columnNamesList, indexSequence);
} else {
  tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, columnNamesList, columnTypesList);
}

indexColumnsWanted = ColumnProjectionUtils.getReadColumnIDs(configuration);
if (!ColumnProjectionUtils.isReadAllColumns(configuration) && !indexColumnsWanted.isEmpty()) {
  requestedSchema = DataWritableReadSupport.getSchemaByIndex(tableSchema, columnNamesList, indexColumnsWanted);
} else {
  requestedSchema = fileSchema;
}

this.reader = new ParquetFileReader(
    configuration, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());
A couple of things to notice here:
- Most of this code is duplicated from the DataWritableReadSupport.init() method.
- The else branch passes in fileSchema instead of tableSchema, which is what the DataWritableReadSupport.init() method uses. Does this cause projected columns to be missed when we read Parquet files? We should probably just reuse the ReadContext returned from DataWritableReadSupport.init() here, as sketched below.
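For illustration, a rough, untested sketch of what that reuse could look like. It assumes fileSchema, footer, file, blocks, configuration, and the requestedSchema field are in scope exactly as in the snippet above, and that DataWritableReadSupport.init(Configuration, Map, MessageType) can be called directly at this point:

// Rough sketch only: delegate schema resolution to DataWritableReadSupport.init()
// instead of duplicating its logic here. ReadContext is
// org.apache.parquet.hadoop.api.ReadSupport.ReadContext.
ReadContext readContext = new DataWritableReadSupport().init(
    configuration, footer.getFileMetaData().getKeyValueMetaData(), fileSchema);

// Use the schema that init() computed, so both code paths agree on projection.
requestedSchema = readContext.getRequestedSchema();

this.reader = new ParquetFileReader(
    configuration, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());

Whether the Configuration available here already carries everything init() expects (for example, the column name/type properties) would need to be verified before taking this approach.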