Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 1.4.1, Impala 2.2
-
None
-
None
Description
It looks like there is a mistake in Bitmap::Get/Set operations. As it was reported by Rui Li:
When the bitmap is created, the buffer size is decided as β(num_bits + 63) >> 6β, which makes sense because the buffer is a vector of 64-bit integer. However, when we get/set bits, the word index is computed as βbit_index >> 8β. I think this will make different bit indices share the same slot, e.g. bit_index 0 and bit_index 64.
Indeed it looks wrong:
In bitmap.h: Bitmap(int64_t num_bits) { buffer_.resize(BitUtil::RoundUpNumi64(num_bits)); size_ = num_bits; } In bitutil.h: /// Returns the rounded up to 64 multiple. Used for conversions of bits to i64. static inline uint32_t RoundUpNumi64(uint32_t bits) { return (bits + 63) >> 6; } But the Set/Get in bitmap.h are shifting the wrong number of bits. E.g.: /// Sets the bit at 'bit_index' to v. If mod is true, this /// function will first mod the bit_index by the bitmap size. template<bool mod> void Set(int64_t bit_index, bool v) { if (mod) bit_index %= size(); int64_t word_index = bit_index >> 8; <== This should be 6. bit_index &= 63; DCHECK_LT(word_index, buffer_.size()); if (v) { buffer_[word_index] |= (1 << bit_index); } else { buffer_[word_index] &= ~(1 << bit_index); } }
Probably we didn't notice this error because the Bitmap is being used only by the join filters that can have false positives. When we fix it we may get a better accuracy in the filters.