Description
As the first improvements to the current LSH implementations, we are planning to do the following:
- Change the output schema to Array of Vector instead of a single Vector
- Use numHashTables as the dimension of Array and numHashFunctions as the dimension of Vector
- Rename RandomProjection to BucketedRandomProjectionLSH, MinHash to MinHashLSH
- Make randUnitVectors/randCoefficients private
- Make Multi-Probe NN Search and hashDistance private for future discussion
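
To make the proposed output shape concrete, here is a minimal pure-Python sketch of a bucketed random projection hash that returns an array with one entry per hash table, where each entry is a vector with one value per hash function. It is only an illustration of the schema described above, not Spark's implementation; the names `make_projections`, `hash_point`, and `bucket_length` are hypothetical and chosen for this example.

```python
import math
import random


def make_projections(dim, num_hash_tables, num_hash_functions, seed=0):
    """Draw one random unit vector per (hash table, hash function) pair."""
    rng = random.Random(seed)
    tables = []
    for _ in range(num_hash_tables):
        functions = []
        for _ in range(num_hash_functions):
            v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            norm = math.sqrt(sum(x * x for x in v))
            functions.append([x / norm for x in v])
        tables.append(functions)
    return tables


def hash_point(point, projections, bucket_length):
    """Return an array (outer list) of hash vectors (inner lists).

    Outer length == number of hash tables,
    inner length == number of hash functions per table.
    """
    return [
        [
            math.floor(sum(p * u for p, u in zip(point, unit)) / bucket_length)
            for unit in functions
        ]
        for functions in projections
    ]


# Hypothetical parameters, chosen only to show the shape of the output.
projections = make_projections(dim=3, num_hash_tables=4, num_hash_functions=2, seed=42)
hashes = hash_point([1.0, 0.5, -2.0], projections, bucket_length=2.0)
```

With these parameters `hashes` is a list of 4 vectors, each holding 2 integer bucket indices, which mirrors the "numHashTables as the dimension of Array, numHashFunctions as the dimension of Vector" layout.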
Issue Links
- Is contained by SPARK-18392 LSH API, algorithm, and documentation follow-ups (Resolved)