Row count estimation currently works as follows:
- the data size is taken from one of the following sources:
  - basicstats, which aggregates the loaded "on-heap" row sizes; other readers are able to give a "raw size" estimate. I've checked ORC, and I'm sure others do the same, although the API docs are a bit vague about the purpose of these methods.
  - if the basicstats-level info is not available, the filesystem-level sum of file sizes is used as the "raw data size", which is then multiplied by the deserialization ratio (currently 1).
The problem with all of this is that the deserialization factor is 1, while the row size also counts the on-heap object headers.
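To make the effect concrete, here is a rough sketch of the estimate as I understand it from the description above: the raw data size is scaled by the deserialization factor and then divided by the average row size. The function name and all of the numbers are made up for illustration; this is not the actual Hive code.

```python
def estimate_row_count(raw_data_size, avg_row_size, deser_factor=1.0):
    """Hypothetical helper: rows ~= (raw size * deser factor) / avg row size."""
    if avg_row_size <= 0:
        raise ValueError("avg_row_size must be positive")
    if raw_data_size <= 0:
        return 0
    estimated_data_size = raw_data_size * deser_factor
    # Report at least one row for non-empty data.
    return max(1, int(estimated_data_size // avg_row_size))


# Assumed numbers: 100 rows of about 10 bytes each on disk (1000 bytes total),
# while the in-memory row size, object headers included, is 40 bytes.
on_disk_size = 1000
in_memory_row_size = 40

# With a factor of 1, the on-disk size is divided by the in-memory row size.
with_factor_1 = estimate_row_count(on_disk_size, in_memory_row_size)

# A factor matching the assumed 4x in-memory blow-up recovers the real count.
with_factor_4 = estimate_row_count(on_disk_size, in_memory_row_size, 4.0)

print(with_factor_1, with_factor_4)
```

Under these assumptions, a factor of 1 underestimates the row count by the same ratio as the in-memory blow-up of a row.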
Example: 20 rows are loaded into a partition in columnstats_partlvl_dp.q