Author: Nathan Salmon <email@example.com>
Date: Wed Mar 1 13:23:41 2017 -0800
IMPALA-4675: Case-insensitive matching of Parquet fields.
The query option PARQUET_FALLBACK_SCHEMA_RESOLUTION
allows matching of Parquet fields by name instead of by
index (the default).
Parquet column names are case-sensitive, but Impala treats
db/table/column/field names as case-insensitive. Today,
there is no way to select Parquet columns with mixed
casing via SQL using the name-based field resolution policy.
This patch changes the matching of Parquet fields to be
case-insensitive.
- Modified the data files backing complextypestbl
to contain fields with mixed casing.
- Several existing tests run against this table,
including the test for name-based resolution.
- I confirmed that without this fix, the existing
name-based resolution tests fail on the modified
data files.
- I locally ran test_scanners.py and test_nested_types.py
on exhaustive with this fix.
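For illustration only, the case-insensitive name matching described
above could be sketched as follows. This is a minimal standalone
sketch, not the code in this patch: the helper names EqualsIgnoreCase
and FindFieldIdxByName are hypothetical, and it assumes ASCII-only
field names.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not Impala's actual code): returns true if 'a'
// and 'b' are equal when ASCII case is ignored.
bool EqualsIgnoreCase(const std::string& a, const std::string& b) {
  return a.size() == b.size() &&
      std::equal(a.begin(), a.end(), b.begin(),
          [](unsigned char x, unsigned char y) {
            return std::tolower(x) == std::tolower(y);
          });
}

// Hypothetical helper: returns the index of the first field in
// 'file_fields' whose name matches 'table_col' case-insensitively,
// or -1 if no field matches.
int FindFieldIdxByName(const std::vector<std::string>& file_fields,
                       const std::string& table_col) {
  for (std::size_t i = 0; i < file_fields.size(); ++i) {
    if (EqualsIgnoreCase(file_fields[i], table_col)) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

With this kind of matching, a table column named nested_struct would
resolve to a file field named Nested_Struct, which an exact string
comparison would miss.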
Reviewed-by: Alex Behm <firstname.lastname@example.org>
Tested-by: Impala Public Jenkins