Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- 0.11.0
- None
- None
Description
ROW__ID, a struct that represents a unique row ID within a partition of a full CRUD transactional table, is currently modeled as a VirtualColumn. The acid metadata columns from which ROW__ID is built are actually stored in the data file.
There is no end to special handling of acid metadata columns in the code to make this work.
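To make the relationship concrete, here is a minimal Python sketch of how a ROW__ID value could be derived from the metadata stored with each row. It assumes the ORC acid event layout (operation, originalTransaction, bucket, rowId, currentTransaction, row) and a three-field ROW__ID; `AcidEvent`, `RowId` and `row_id_of` are hypothetical names for illustration, not Hive classes.

```python
from typing import NamedTuple


class AcidEvent(NamedTuple):
    """One record as stored in an acid data file (assumed layout)."""
    operation: int             # insert / update / delete code
    original_transaction: int  # write that first created the row
    bucket: int                # encoded bucket property
    row_id: int                # row number within that write and bucket
    current_transaction: int   # write that produced this event
    row: dict                  # the user-visible columns


class RowId(NamedTuple):
    """The struct exposed to queries as ROW__ID (assumed field set)."""
    write_id: int
    bucket_id: int
    row_id: int


def row_id_of(event: AcidEvent) -> RowId:
    # ROW__ID is a projection of three of the stored metadata columns;
    # nothing in it has to be computed for rows written by acid writers.
    return RowId(event.original_transaction, event.bucket, event.row_id)


event = AcidEvent(0, 7, 536870912, 3, 7, {"name": "a"})
print(row_id_of(event))
```

Since the struct is only a projection of columns that are already in the file, exposing it as a regular column would mostly be a matter of schema, not of data.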
Perhaps a better approach is to add a struct column to an acid table at creation time and make it a first-class citizen visible in the metastore. 'select count ....' would need special handling to remove it. There may also need to be a way to make these columns read-only.
For data added via Load Data, Add Partition, etc. (i.e., original files in a CRUD table), the acid reader would have to fill in the values, as it does today.
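A rough Python sketch of what "filling in the values" could look like for an original file that carries no acid metadata. It assumes the reader assigns write ID 0 and numbers rows consecutively within a bucket, continuing from the rows of any earlier files in that bucket; `synthesize_row_ids` and its parameters are hypothetical and do not correspond to a Hive API.

```python
from typing import Any, Iterable, Iterator, Tuple

RowIdTuple = Tuple[int, int, int]  # (write_id, bucket_id, row_id)


def synthesize_row_ids(
    rows: Iterable[Any],
    bucket_id: int,
    row_id_offset: int = 0,
    write_id: int = 0,
) -> Iterator[Tuple[RowIdTuple, Any]]:
    """Attach a generated row ID to each row of an original file."""
    for position, row in enumerate(rows):
        # Assumption: the row ID is the row's position in the file plus
        # the number of rows in preceding files of the same bucket.
        yield (write_id, bucket_id, row_id_offset + position), row


tagged = list(synthesize_row_ids(["a", "b", "c"], bucket_id=1, row_id_offset=10))
for row_id, row in tagged:
    print(row_id, row)
```

With a stored struct column, this synthesis would be the only remaining special case: every other row would already carry its ID in the file.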
This would make schema evolution, predicate pushdown (PPD), and projection pruning work seamlessly.
It should also make it easy to add formats other than ORC to full CRUD tables.
This will likely be painful but should be investigated.
Issue Links
- is related to: HIVE-20332 Materialized views: Introduce heuristic on selectivity over ROW__ID to favour incremental rebuild (Resolved)