Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 4.0.0
- None
Description
In Hive Iceberg, every table has a corresponding metadata table, "*.data_files", that describes the files holding the table's data.
Running select count(*) against a data_files metadata table returns the number of rows in the data table instead of the number of data files listed in the metadata table.
CREATE TABLE x (name VARCHAR(50), age TINYINT, num_clicks BIGINT)
  STORED BY ICEBERG STORED AS ORC
  TBLPROPERTIES ('external.table.purge'='true', 'format-version'='2');

INSERT INTO x VALUES ('amy', 35, 123412344), ('adxfvy', 36, 123412534), ('amsdfyy', 37, 123417234), ('asafmy', 38, 123412534);
INSERT INTO x VALUES ('amerqwy', 39, 123441234), ('amyxzcv', 40, 123341234), ('erweramy', 45, 122341234);

SELECT * FROM default.x.data_files;        -- Returns 2 records in the output
SELECT count(*) FROM default.x.data_files; -- Returns 7 instead of 2
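To make the expected numbers concrete, here is a small Python sketch of the repro. It is a toy model, not Hive or Iceberg code. It assumes each INSERT commits exactly one data file, which is consistent with the two rows the report sees in default.x.data_files. The DataFile class and the file paths are hypothetical names used only for illustration.

```python
# Toy model of the repro; this is NOT Hive or Iceberg code.
# Assumption: each INSERT commits exactly one data file, matching the
# two rows the report sees in default.x.data_files.
from dataclasses import dataclass


@dataclass
class DataFile:
    file_path: str     # hypothetical path, for illustration only
    record_count: int  # number of table rows stored in this file


# The two INSERT statements from the repro, as lists of rows.
inserts = [
    [("amy", 35, 123412344), ("adxfvy", 36, 123412534),
     ("amsdfyy", 37, 123417234), ("asafmy", 38, 123412534)],
    [("amerqwy", 39, 123441234), ("amyxzcv", 40, 123341234),
     ("erweramy", 45, 122341234)],
]

# One metadata row per data file (under the assumption above).
data_files = [
    DataFile(f"data/file-{i}.orc", len(rows))
    for i, rows in enumerate(inserts)
]

# What a count over the metadata table should return: one per data file.
expected_count = len(data_files)

# Number of rows in the data table, which is the value the report observed.
reported_count = sum(f.record_count for f in data_files)

print(expected_count, reported_count)
```

In this model the metadata table has two rows and the data table has seven, so the count over data_files should be 2, while 7 is the row count of table x itself.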