# Wrong comment in random matrix generator in spark-als algorithm

#### Details

• Improvement
• Status: Resolved
• Minor
• Resolution: Fixed
• 3.1.1, 3.1.2, 3.2.1
• None

#### Description

In the Spark ALS algorithm, we need to initialize nonnegative factor matrices for users and items.

In ALS:

```
private def initialize[ID](
    inBlocks: RDD[(Int, InBlock[ID])],
    rank: Int,
    seed: Long): RDD[(Int, FactorBlock)] = {
  // Choose a unit vector uniformly at random from the unit sphere, but from the
  // "first quadrant" where all elements are nonnegative. This can be done by choosing
  // elements distributed as Normal(0,1) and taking the absolute value, and then normalizing.
  // This appears to create factorizations that have a slightly better reconstruction
  // (<1%) compared picking elements uniformly at random in [0,1].
  inBlocks.mapPartitions({ iter =>
    iter.map {
      case (srcBlockId, inBlock) =>
        val random: XORShiftRandom = new XORShiftRandom(byteswap64(seed ^ srcBlockId))
        val factors: Array[Array[Float]] = Array.fill(inBlock.srcIds.length) {
          val factor = Array.fill(rank)(random.nextGaussian().toFloat)
          val nrm: Float = blas.snrm2(rank, factor, 1)
          blas.sscal(rank, 1.0f / nrm, factor, 1)
          factor
        }
        (srcBlockId, factors)
    }
  }, preservesPartitioning = true)
}
```

The comment says that each factor vector is drawn from the "first quadrant", so that all elements are nonnegative, by taking the absolute value of Normal(0,1) samples. The code, however, uses `random.nextGaussian().toFloat` directly and never takes the absolute value. The documentation and source of the `nextGaussian` method show that it also returns negative numbers:

```
/**
 * @return the next pseudorandom, Gaussian ("normally") distributed
 *         {@code double} value with mean {@code 0.0} and
 *         standard deviation {@code 1.0} from this random number
 *         generator's sequence
 */
synchronized public double nextGaussian() {
    // See Knuth, ACP, Section 3.4.1 Algorithm C.
    if (haveNextNextGaussian) {
        haveNextNextGaussian = false;
        return nextNextGaussian;
    } else {
        double v1, v2, s;
        do {
            v1 = 2 * nextDouble() - 1; // between -1 and 1
            v2 = 2 * nextDouble() - 1; // between -1 and 1
            s = v1 * v1 + v2 * v2;
        } while (s >= 1 || s == 0);
        double multiplier = StrictMath.sqrt(-2 * StrictMath.log(s)/s);
        nextNextGaussian = v2 * multiplier;
        haveNextNextGaussian = true;
        return v1 * multiplier;
    }
}
```
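
A quick way to confirm that `nextGaussian` yields values of both signs is to count the negative samples in a batch of draws. The sketch below uses `java.util.Random` with an arbitrary seed; the class name `GaussianSignCheck` and the method `countNegative` are hypothetical and not part of Spark.

```java
import java.util.Random;

public class GaussianSignCheck {
    // Counts how many of n draws from nextGaussian() are strictly negative.
    static int countNegative(long seed, int n) {
        Random random = new Random(seed);
        int negatives = 0;
        for (int i = 0; i < n; i++) {
            if (random.nextGaussian() < 0.0) {
                negatives++;
            }
        }
        return negatives;
    }

    public static void main(String[] args) {
        int n = 10_000;
        int negatives = countNegative(42L, n);
        // For a zero-mean distribution, roughly half of the samples should be negative.
        System.out.println(negatives + " of " + n + " samples are negative");
    }
}
```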

As a result, the generated factor matrix contains negative values.
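
For comparison, here is a minimal sketch of the two behaviours: the initialization the comment describes (absolute value of Normal(0,1) samples, then normalization) and the one the code performs (normalization only). It is an illustration under stated assumptions, not the Spark implementation or its fix: `java.util.Random` stands in for `XORShiftRandom`, plain loops replace `blas.snrm2` and `blas.sscal`, and the names `NonnegativeFactorSketch`, `randomUnitVector` and `hasNegative` are hypothetical.

```java
import java.util.Random;

public class NonnegativeFactorSketch {
    // Draws a random unit vector of length rank.
    // If nonnegative is true, each Gaussian sample is replaced by its absolute
    // value before normalizing, as the comment describes; otherwise the samples
    // are used as-is, as in the quoted code.
    static float[] randomUnitVector(Random random, int rank, boolean nonnegative) {
        float[] factor = new float[rank];
        double sumSq = 0.0;
        for (int i = 0; i < rank; i++) {
            float g = (float) random.nextGaussian();
            factor[i] = nonnegative ? Math.abs(g) : g;
            sumSq += (double) factor[i] * factor[i];
        }
        // Scale to unit Euclidean norm (stand-in for snrm2 followed by sscal).
        float norm = (float) Math.sqrt(sumSq);
        for (int i = 0; i < rank; i++) {
            factor[i] /= norm;
        }
        return factor;
    }

    // Returns true if any element of the vector is strictly negative.
    static boolean hasNegative(float[] v) {
        for (float x : v) {
            if (x < 0.0f) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int rank = 10;
        float[] asCommented = randomUnitVector(new Random(7L), rank, true);
        float[] asCoded = randomUnitVector(new Random(7L), rank, false);
        System.out.println("abs then normalize, has negative: " + hasNegative(asCommented));
        System.out.println("normalize only, has negative: " + hasNegative(asCoded));
    }
}
```

Both variants produce unit vectors; only the first is restricted to nonnegative elements.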

#### People

Sean R. Owen
Nikolay

#### Time Tracking

• Estimated: 24h
• Remaining: 24h
• Logged: Not Specified