For consistency and code reuse, QuantileDiscretizer should use approxQuantile to find splits in the data rather than implement its own method.
Additionally, making this change should fix a bug where QuantileDiscretizer fails to calculate the correct splits in certain circumstances, resulting in an incorrect number of buckets/bins. For example:
val df = sc.parallelize(1.0 to 10.0 by 1.0).map(Tuple1.apply).toDF("x")
val discretizer = new QuantileDiscretizer().setInputCol("x").setOutputCol("y").setNumBuckets(5)
Fitting this discretizer to df and reading the splits from the resulting model gives:
Array(-Infinity, 2.0, 4.0, 6.0, 8.0, 10.0, Infinity)
which corresponds to 6 buckets (not 5).
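
As a rough illustration of the proposed approach (not the actual patch), the splits could be derived from approxQuantile along the following lines. The object name ApproxQuantileSplits and the helper quantileSplits are hypothetical, and the sketch assumes a Spark version (2.0 or later) that provides SparkSession and DataFrame.stat.approxQuantile. Because only numBuckets - 1 interior probabilities are requested, the result can never contain more than numBuckets buckets.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper, for illustration only; not part of Spark.
object ApproxQuantileSplits {

  // Returns split points for at most `numBuckets` buckets over column `col`.
  def quantileSplits(df: DataFrame, col: String, numBuckets: Int): Array[Double] = {
    // Interior probabilities only, e.g. 0.2, 0.4, 0.6, 0.8 for 5 buckets.
    val probabilities = (1 until numBuckets).map(_.toDouble / numBuckets).toArray
    // relativeError = 0.0 requests exact quantiles (fine for small data).
    val interior = df.stat.approxQuantile(col, probabilities, 0.0)
    // Drop duplicate split points, then pad with the infinite outer bounds.
    (Double.NegativeInfinity +: interior.distinct) :+ Double.PositiveInfinity
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("approx-quantile-splits")
      .getOrCreate()
    import spark.implicits._

    val df = (1 to 10).map(_.toDouble).toDF("x")
    val splits = quantileSplits(df, "x", 5)
    println(splits.mkString("Array(", ", ", ")"))
    println(s"buckets = ${splits.length - 1}")

    spark.stop()
  }
}
```

The resulting array could then be passed to a Bucketizer via setSplits, provided the interior values are strictly increasing, which distinct is meant to ensure here.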