I was wondering how Impala should handle a logical type that is stored with different physical types across multiple files. Does a table schema define only the logical type of each field, leaving each file free to use any primitive type that can represent that logical type? If so, is an HDFS directory a valid Impala table if it contains two Parquet files whose columns have matching logical types but different primitive types?
For example, given these two files:
- `test/file1.par`: single column: name = `num`, primitive type = BYTE_ARRAY, logical type = DECIMAL.
- `test/file2.par`: single column: name = `num`, primitive type = FIXED_LEN_BYTE_ARRAY, logical type = DECIMAL.
Should we be able to define a table over these two files in Impala? I think the answer is yes, but I would like to get your feedback. The alternative approach, which would require Impala users to specify non-default primitive types for table columns, is less flexible and seems much more complicated to implement.
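To illustrate why reading both files through one logical schema seems feasible, here is a minimal Python sketch. It relies on the Parquet format's rule that a DECIMAL stored as either BYTE_ARRAY or FIXED_LEN_BYTE_ARRAY holds the unscaled value as a big-endian two's-complement integer. The helper name `decode_decimal`, the sample value, and the scale of 2 are made up for illustration; this is not Impala's actual reader code.

```python
from decimal import Decimal


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode the unscaled bytes of a Parquet DECIMAL value.

    Hypothetical helper: works for BYTE_ARRAY and FIXED_LEN_BYTE_ARRAY alike,
    because both hold a big-endian two's-complement unscaled integer.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


# Assumed sample: the value 123.45 with scale 2, i.e. unscaled integer 12345.
# A BYTE_ARRAY writer may use the minimal number of bytes ...
as_byte_array = (12345).to_bytes(2, byteorder="big", signed=True)
# ... while a FIXED_LEN_BYTE_ARRAY writer pads to a fixed width (16 assumed here).
as_fixed_len = (12345).to_bytes(16, byteorder="big", signed=True)

value_from_file1 = decode_decimal(as_byte_array, scale=2)
value_from_file2 = decode_decimal(as_fixed_len, scale=2)

# Negative values are sign-extended in the fixed-width form.
neg_short = (-12345).to_bytes(2, byteorder="big", signed=True)
neg_fixed = (-12345).to_bytes(16, byteorder="big", signed=True)
neg_from_file1 = decode_decimal(neg_short, scale=2)
neg_from_file2 = decode_decimal(neg_fixed, scale=2)
```

Under these assumptions the two encodings differ only in byte length, so a reader could choose the decoding path per file from the file's own metadata while the table schema stays purely logical.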