IMPALA-11608

Impala SHOW TABLE STATS shows wrong number of files for Iceberg tables

    Description

      SHOW TABLE STATS reports the wrong number of files for Iceberg tables. It currently counts every file under the table directory, including metadata files, orphaned files, and old data files that do not belong to the current snapshot.

      It should report only the number of data files in the current snapshot, which would make the output consistent with SHOW FILES IN tbl;

      For example, a newly created table with no rows inserted already reports 2 files:

      create table test (i int) stored as iceberg;
      
      compute stats test;
      
      show table stats test;
      
      +-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
      | #Rows | #Files | Size   | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                   |
      +-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
      | -1    | 2      | 2.70KB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test |
      +-------+--------+--------+--------------+-------------------+---------+-------------------+--------------------------------------------+
      

      SHOW TABLE STATS is handled here: https://github.com/apache/impala/blob/66484a4c081f3242750a3a0e04159dd4580b37a4/fe/src/main/java/org/apache/impala/service/Frontend.java#L1429-L1457
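
      The difference between the reported and the expected count can be illustrated with a toy model. This is only a sketch: the file paths, the two helper functions, and the idea of representing the current snapshot as a plain set of paths are all assumptions made for illustration, and none of it is Impala's or Iceberg's actual API.

```python
from typing import Iterable


def count_directory_files(directory_listing: Iterable[str]) -> int:
    """Count every file under the table directory (the behavior reported as wrong)."""
    return sum(1 for _ in directory_listing)


def count_snapshot_data_files(current_snapshot_data_files: Iterable[str]) -> int:
    """Count only the data files referenced by the current snapshot (the expected behavior)."""
    return len(set(current_snapshot_data_files))


# Hypothetical contents of a table directory; all paths are made up.
directory_listing = [
    "test/metadata/v1.metadata.json",   # metadata file
    "test/metadata/version-hint.text",  # metadata file
    "test/data/old-0001.parquet",       # data file from an older snapshot
    "test/data/live-0002.parquet",      # data file in the current snapshot
    "test/data/orphan-0003.parquet",    # orphaned file, referenced by no snapshot
]

# Hypothetical set of data files that the current snapshot references.
current_snapshot_data_files = {"test/data/live-0002.parquet"}

reported = count_directory_files(directory_listing)
expected = count_snapshot_data_files(current_snapshot_data_files)
print(reported, expected)
```

      In this model the directory listing yields 5 while the current snapshot contains 1 data file, which is the number SHOW FILES IN tbl; would be expected to list.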

          People

            lipenglin Li Penglin
            boroknagyz Zoltán Borók-Nagy