Details
-
Bug
-
Status: To Do
-
Minor
-
Resolution: Unresolved
-
None
-
None
-
None
-
None
Description
val df = repartition(preprocess(inputDF), inputDF.sparkSession.sparkContext.defaultParallelism) if (inputDF.isStreaming) throw new IllegalStateException("AnnoyIndexBuildSink can not be run as streaming.") else { ALSModelProcess.buildAnnoyIndex(conf, inputDF) }
In "write" method of AnnoyIndexBuildSink class, variable "df" is never used.
So, repartition does not working .
I think that second parameter of buildAnnoyIndex shoule be "df", not "inputDF".