Description
According to the __getitem__ contract:
if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised.
This is required, for example, for correct iteration over the structure.
Right now it raises ValueError instead, which leads to quite confusing behavior: iterating over a vector fails with a ValueError because the iteration is never terminated:
In [1]: from pyspark.mllib.linalg import SparseVector

In [2]: list(SparseVector(4, [0], [0]))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-147f3bb0a47d> in <module>()
----> 1 list(SparseVector(4, [0], [0]))

/opt/spark-2.0/python/pyspark/mllib/linalg/__init__.py in __getitem__(self, index)
    803
    804         if index >= self.size or index < -self.size:
--> 805             raise ValueError("Index %d out of bounds." % index)
    806         if index < 0:
    807             index += self.size

ValueError: Index 4 out of bounds.
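
The effect can be reproduced without Spark. The following is a minimal sketch, not the pyspark implementation: TinySparseVector and its error_type parameter are hypothetical names introduced only for illustration, and the bounds check simply mirrors the lines shown in the traceback. It relies on the fact that a class defining only __getitem__ is iterated by calling it with 0, 1, 2, ... until IndexError is raised.

```python
class TinySparseVector:
    """Hypothetical stand-in for SparseVector; not the pyspark class."""

    def __init__(self, size, indices, values, error_type):
        self.size = size
        self._data = dict(zip(indices, values))
        # Exception class raised for out-of-range indexes (illustration only).
        self._error_type = error_type

    def __getitem__(self, index):
        # Same bounds check as in the traceback, with a configurable exception.
        if index >= self.size or index < -self.size:
            raise self._error_type("Index %d out of bounds." % index)
        if index < 0:
            index += self.size
        return self._data.get(index, 0.0)


# With IndexError, iteration via __getitem__ stops cleanly after the last index.
ok = list(TinySparseVector(4, [0], [1.0], IndexError))

# With ValueError, the same iteration leaks the exception to the caller.
try:
    list(TinySparseVector(4, [0], [1.0], ValueError))
    leaked = None
except ValueError as exc:
    leaked = str(exc)
```

In this sketch, the first call yields a four-element list, while the second fails at index 4, which matches the message in the traceback.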