SPARK-18408

API Improvements for LSH

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: ML
    • Labels: None

    Description

      As a first round of improvements to the current LSH implementations, we plan to make the following changes:

      • Change the output schema to an Array of Vector instead of a single Vector
      • Use numHashTables as the dimension of the Array and numHashFunctions as the dimension of each Vector
      • Rename RandomProjection to BucketedRandomProjectionLSH and MinHash to MinHashLSH
      • Make randUnitVectors/randCoefficients private
      • Make Multi-Probe NN Search and hashDistance private pending further discussion
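
To make the proposed output shape concrete, here is a minimal plain-Python sketch of a MinHash-style hash whose result is a list with numHashTables entries, each a vector of numHashFunctions values. It is not Spark's implementation: the helper names make_coefficients and min_hash, the modulus, and the seeding scheme are all assumptions made for illustration.

```python
import random

# Large modulus chosen for illustration only (an assumption, not taken from the issue).
MODULUS = 2038074743


def make_coefficients(num_hash_tables, num_hash_functions, seed=0):
    """Draw random (a, b) pairs: one row per hash table, one pair per hash function."""
    rng = random.Random(seed)
    return [
        [(rng.randrange(1, MODULUS), rng.randrange(0, MODULUS))
         for _ in range(num_hash_functions)]
        for _ in range(num_hash_tables)
    ]


def min_hash(active_indices, coefficients):
    """Hash a set of active indices.

    Returns a list with one entry per hash table; each entry is a vector
    (list of floats) with one value per hash function.
    """
    if not active_indices:
        raise ValueError("need at least one active index")
    return [
        [float(min(((1 + i) * a + b) % MODULUS for i in active_indices))
         for (a, b) in table]
        for table in coefficients
    ]


coeffs = make_coefficients(num_hash_tables=3, num_hash_functions=2, seed=42)
hashes = min_hash([0, 3, 7], coeffs)
print(len(hashes), len(hashes[0]))  # 3 2
```

Under this layout, the outer dimension tracks the number of hash tables and the inner dimension tracks the number of hash functions per table, so the two parameters can be tuned independently.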

            People

              Assignee: Yun Ni (yunn)
              Reporter: Yun Ni (yunn)