I have an ORC table for which select count(*) returns a different figure from the number of rows returned by select *.
At first I thought the fix was obvious: run "analyze table ... compute statistics" and the count would correct itself. However, I've tried that, with and without "for columns", and the results remain the same. The select count(*) returns very quickly, so it must be using pre-computed stats.
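For reference, the statements were along these lines. This is only a sketch: my_orc_table is a placeholder name rather than the real table, and the describe formatted call is just one way I know of to look at the row count Hive has stored for the table.

```sql
-- my_orc_table is a placeholder, not the real table name.

-- Recompute basic table stats (row count, file count, sizes).
ANALYZE TABLE my_orc_table COMPUTE STATISTICS;

-- Recompute column-level stats as well.
ANALYZE TABLE my_orc_table COMPUTE STATISTICS FOR COLUMNS;

-- The "Table Parameters" section of the output lists numRows
-- and COLUMN_STATS_ACCURATE, i.e. what the Metastore currently holds.
DESCRIBE FORMATTED my_orc_table;

-- Still returns the unexpected figure, and returns almost instantly.
SELECT COUNT(*) FROM my_orc_table;
```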
When I copy the table into a text table or into another ORC table, select count(*) on the new table returns the correct number.
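The copies were made roughly as follows. All table names here are placeholders chosen for illustration.

```sql
-- Placeholder names throughout; the source table stands in for the real one.

-- Copy into a new ORC table.
CREATE TABLE my_orc_table_copy STORED AS ORC
AS SELECT * FROM my_orc_table;

-- Copy into a plain text table.
CREATE TABLE my_text_table_copy STORED AS TEXTFILE
AS SELECT * FROM my_orc_table;

-- Both of these return the count I expect.
SELECT COUNT(*) FROM my_orc_table_copy;
SELECT COUNT(*) FROM my_text_table_copy;
```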
I've even tried disabling stats and CBO (the works) and restarting, with the same result: select count(*) still returns very quickly each time. That suggests it is using either pre-computed stats stored in the Metastore or the statistics stored in the ORC files themselves. But I'm not sure how ORC could store the wrong count, especially since a CTAS to another ORC table gives the correct count when I run select count(*) on the new table.
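By "disabling stats, CBO" I mean session settings along the lines below. I'm assuming these are the relevant properties; the exact set that matters may differ by Hive version, and my_orc_table is again a placeholder name.

```sql
-- Do not answer count(*)-style queries purely from Metastore stats.
SET hive.compute.query.using.stats=false;

-- Do not gather stats automatically on insert/overwrite.
SET hive.stats.autogather=false;

-- Turn off the cost-based optimizer.
SET hive.cbo.enable=false;

-- Placeholder table name; the figure is unchanged and still returns very fast.
SELECT COUNT(*) FROM my_orc_table;
```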