Details
- Type: Sub-task
- Status: Reopened
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.0.0
- Fix Version/s: None
- Component/s: None
- Labels: None
Description
The code below may cause a performance problem for queries that reference no columns, such as SELECT count(1) FROM parquetTable:
    if (!ColumnProjectionUtils.isReadAllColumns(configuration) && !indexColumnsWanted.isEmpty()) {
      MessageType requestedSchemaByUser = getSchemaByIndex(tableSchema, columnNamesList, indexColumnsWanted);
      return new ReadContext(requestedSchemaByUser, contextMetadata);
    } else {
      return new ReadContext(tableSchema, contextMetadata);
    }
If neither columns nor column indexes are selected, the code above falls through to the else branch and requests the full table schema from Parquet, even though Hive does not use any of those values.
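The branch behavior can be illustrated with a small self-contained sketch. It is not the Hive or Parquet code: the schema is modeled as a plain list of column names, and the class ProjectionSketch and the methods currentSelection, narrowedSelection, and byIndex are hypothetical names introduced for illustration only. narrowedSelection shows one possible mitigation, requesting a single column when nothing is selected, on the assumption (suggested by the title of the linked PARQUET-363) that an empty requested schema could not be constructed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ProjectionSketch {

    // Mirrors the condition in the description: project only when
    // "read all columns" is off AND at least one column index is wanted.
    static List<String> currentSelection(List<String> tableSchema,
                                         boolean readAllColumns,
                                         List<Integer> indexColumnsWanted) {
        if (!readAllColumns && !indexColumnsWanted.isEmpty()) {
            return byIndex(tableSchema, indexColumnsWanted);
        }
        // count(1) ends up here: no indexes wanted, so the full schema is used.
        return tableSchema;
    }

    // Hypothetical alternative (an assumption, not the actual fix):
    // when no column is wanted, request only the first column.
    static List<String> narrowedSelection(List<String> tableSchema,
                                          boolean readAllColumns,
                                          List<Integer> indexColumnsWanted) {
        if (readAllColumns) {
            return tableSchema;
        }
        if (indexColumnsWanted.isEmpty()) {
            return tableSchema.isEmpty() ? tableSchema : tableSchema.subList(0, 1);
        }
        return byIndex(tableSchema, indexColumnsWanted);
    }

    // Stand-in for getSchemaByIndex: keeps the columns at the given positions.
    static List<String> byIndex(List<String> schema, List<Integer> indexes) {
        List<String> selected = new ArrayList<>();
        for (int i : indexes) {
            if (i >= 0 && i < schema.size()) {
                selected.add(schema.get(i));
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("id", "name", "payload");
        List<Integer> noColumns = Collections.emptyList();

        System.out.println(currentSelection(schema, false, noColumns));  // [id, name, payload]
        System.out.println(narrowedSelection(schema, false, noColumns)); // [id]
    }
}
```

With no column indexes wanted, the first method returns all three columns while the second returns one, which is the difference in read volume the description is concerned with.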
Attachments
Issue Links
is related to
- PARQUET-363 Cannot construct empty MessageType for ReadContext.requestedSchema (Resolved)
- PARQUET-278 enforce non empty group on MessageType level (Resolved)