IMPALA-8458

Can't set numNull/maxSize/avgSize column stats with local catalog without also setting NDV

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 3.3.0
    • Fix Version/s: Impala 3.3.0
    • Component/s: Catalog
    • Labels: None

    Description

      Repro:

      [tarmstrong-box2.ca.cloudera.com:21000] default> create table test_stats2(s string);
      +-------------------------+
      | summary                 |
      +-------------------------+
      | Table has been created. |
      +-------------------------+
      Fetched 1 row(s) in 0.36s
      [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
      +--------+--------+------------------+--------+----------+----------+
      | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
      +--------+--------+------------------+--------+----------+----------+
      | s      | STRING | -1               | -1     | -1       | -1       |
      +--------+--------+------------------+--------+----------+----------+
      Fetched 1 row(s) in 0.02s
      [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('avgSize'='1234');
      +-----------------------------------------+
      | summary                                 |
      +-----------------------------------------+
      | Updated 0 partition(s) and 1 column(s). |
      +-----------------------------------------+
      Fetched 1 row(s) in 0.14s
      [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
      +--------+--------+------------------+--------+----------+----------+
      | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
      +--------+--------+------------------+--------+----------+----------+
      | s      | STRING | -1               | -1     | -1       | -1       |
      +--------+--------+------------------+--------+----------+----------+
      Fetched 1 row(s) in 0.02s
      [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('maxSize'='1234');
      +-----------------------------------------+
      | summary                                 |
      +-----------------------------------------+
      | Updated 0 partition(s) and 1 column(s). |
      +-----------------------------------------+
      Fetched 1 row(s) in 0.10s
      [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
      +--------+--------+------------------+--------+----------+----------+
      | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
      +--------+--------+------------------+--------+----------+----------+
      | s      | STRING | -1               | -1     | -1       | -1       |
      +--------+--------+------------------+--------+----------+----------+
      Fetched 1 row(s) in 0.02s
      [tarmstrong-box2.ca.cloudera.com:21000] default> invalidate metadata test_stats2;
      Fetched 0 row(s) in 0.03s
      [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
      Query: show column stats test_stats2
      +--------+--------+------------------+--------+----------+----------+
      | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
      +--------+--------+------------------+--------+----------+----------+
      | s      | STRING | -1               | -1     | -1       | -1       |
      +--------+--------+------------------+--------+----------+----------+
      Fetched 1 row(s) in 0.07s
      

      I expected the updates to take effect, but avgSize and maxSize remain -1, even after invalidating metadata. Oddly, the problem does not occur when numDVs and numNulls are set together:

      [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set column stats s('numDVs'='1234','numNulls'='12345');
      Query: alter table test_stats2 set column stats s('numDVs'='1234','numNulls'='12345')
      +-----------------------------------------+
      | summary                                 |
      +-----------------------------------------+
      | Updated 0 partition(s) and 1 column(s). |
      +-----------------------------------------+
      Fetched 1 row(s) in 0.12s
      [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats test_stats2;
      Query: show column stats test_stats2
      +--------+--------+------------------+--------+----------+----------+
      | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
      +--------+--------+------------------+--------+----------+----------+
      | s      | STRING | 1234             | 12345  | -1       | -1       |
      +--------+--------+------------------+--------+----------+----------+
      Fetched 1 row(s) in 0.02s
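
      Going by the issue title, the stats that were dropped in the first repro appear to be accepted when numDVs is set in the same statement. A possible interim workaround, untested here and assuming the same table with an arbitrary NDV value, might look like this:

      ```sql
      -- Assumption: including numDVs lets avgSize and maxSize stick on an
      -- affected build. The values are placeholders, not measured stats.
      alter table test_stats2 set column stats s('numDVs'='1234','avgSize'='1234','maxSize'='1234');

      -- Check whether the Max Size and Avg Size columns now show 1234.
      show column stats test_stats2;
      ```

      On a build that includes the fix, setting avgSize or maxSize alone should be enough.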
      

          People

            Assignee: Todd Lipcon (tlipcon)
            Reporter: Tim Armstrong (tarmstrong)