I'm a bit confused by this section of the design doc:
It is pretty hard to define a common interface. Because LSH algorithm has two types at least. One is to calculate hash value. The other is to calculate a similarity between a feature(vector) and another one.
For example, random projection algorithm is a type of calculating a similarity. It is designed to approximate the cosine distance between vectors. On the other hand, min hash algorithm is a type of calculating a hash value. The hash function maps a d dimensional vector onto a set of integers.
Sign-random-projection LSH does calculate a hash value (essentially a Bitset) for each feature vector, and the Hamming distance between two hash values is used to estimate the cosine similarity between the corresponding feature vectors. The two "types" of LSH mentioned here seem more like two kinds of operations which are sometimes applied sequentially. Maybe this distinction makes more sense for other types of LSH?
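To make the point concrete, here is a minimal sketch of sign-random-projection LSH in plain Python, showing the two operations applied one after the other: first hash each vector to a bit signature, then estimate cosine similarity from the Hamming distance between signatures. All function names (make_hyperplanes, srp_hash, estimate_cosine) and the parameters (2048 bits, seed 0) are made up for illustration; they are not from the design doc or any existing API.

```python
import math
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplane normals (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def srp_hash(vector, hyperplanes):
    """Operation 1: hash a vector to a bit signature, one sign bit per hyperplane."""
    return [
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vector)) >= 0 else 0
        for h in hyperplanes
    ]


def hamming(sig_a, sig_b):
    """Number of positions where two signatures differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


def estimate_cosine(sig_a, sig_b):
    """Operation 2: estimate cosine similarity from two signatures.

    Each bit differs with probability angle / pi, so the fraction of
    differing bits times pi estimates the angle between the vectors.
    """
    angle = math.pi * hamming(sig_a, sig_b) / len(sig_a)
    return math.cos(angle)


def exact_cosine(u, v):
    """Exact cosine similarity, for comparison only."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


planes = make_hyperplanes(num_bits=2048, dim=3)
u = [1.0, 2.0, 3.0]
v = [2.0, 1.0, 0.5]

sig_u = srp_hash(u, planes)  # hash value for u
sig_v = srp_hash(v, planes)  # hash value for v

estimate = estimate_cosine(sig_u, sig_v)
exact = exact_cosine(u, v)
```

With enough bits the estimate should land close to the exact value, which suggests that "calculating a hash value" and "calculating a similarity" are two stages of the same algorithm here rather than two different kinds of LSH.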