Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
Currently ColumnMetaData only contains bloom_filter_offset, which points to BloomFilterHeader followed by the bloom filter data.
This solution is not optimal during reading, as two IO reads are needed once we know bloom_filter_offset - one to read the header, which contains the size of the bloom filter, then another to read the actual bloom filter to a buffer. Having the size near bloom_filter_offset would allow to do this in a single read.
Having algorithm/hash/compression could be also useful by allowing skipping the read of the bloom filter if one of those parameters is not supported.