Description
Currently, Spark's levenshtein(str1, str2) function can be very inefficient for long strings. Many other databases that support this kind of built-in function also accept a third argument specifying a maximum distance, beyond which the algorithm is allowed to terminate early.
For example, something like:
levenshtein(str1, str2[, max_distance])
The function stops computing the distance once the maximum value is reached.
See PostgreSQL for an example of a three-argument levenshtein.
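
To illustrate the early-termination idea, here is a minimal Python sketch, not Spark's implementation. The name bounded_levenshtein and the convention of returning -1 once the distance is known to exceed max_distance are assumptions made for this example only; the actual return value for the over-limit case would need to be decided as part of this change.

```python
def bounded_levenshtein(s, t, max_distance):
    """Return the edit distance between s and t, or -1 if it exceeds max_distance.

    The -1 sentinel is a hypothetical convention chosen for this sketch.
    """
    if max_distance < 0:
        raise ValueError("max_distance must be non-negative")

    # The length difference is a lower bound on the edit distance,
    # so some inputs can be rejected without filling any table rows.
    if abs(len(s) - len(t)) > max_distance:
        return -1

    # Standard two-row dynamic programming: prev holds the distances
    # between the first i-1 characters of s and every prefix of t.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        # The minimum of a row never decreases from one row to the next,
        # so once it exceeds the limit the final distance must as well.
        if min(curr) > max_distance:
            return -1
        prev = curr

    return prev[-1] if prev[-1] <= max_distance else -1


print(bounded_levenshtein("kitten", "sitting", 3))
print(bounded_levenshtein("kitten", "sitting", 2))
```

With two long, dissimilar strings and a small max_distance, the loop above exits after only a few rows instead of filling the whole table. This sketch still computes every cell of each row it visits; an implementation could likely go further by restricting work to a diagonal band of width related to max_distance.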