Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
Currently, the documentation at http://madlib.apache.org/docs/latest/group__grp__svm.html#kernel_params says:
n_components Default: 2*num_features. The dimensionality of the transformed feature space. A larger value lowers the variance of the estimate of the kernel but requires more memory and takes longer to train.
However, this default produces poor decision boundaries when num_features is small: with 2 features, for example, the transformed space has only 4 components. I suggest we change the default to:
n_components Default: max(100, 2*num_features). The dimensionality of the transformed feature space. A larger value lowers the variance of the estimate of the kernel but requires more memory and takes longer to train.
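To illustrate why a floor on n_components helps, here is a minimal Python sketch. It assumes the Gaussian kernel is approximated with random Fourier features, which is an assumption made for illustration and is not stated in this issue. The helper names (default_n_components, rff_estimate, mean_squared_error) and the sample points are hypothetical and are not part of the MADlib API. The sketch compares the error of the kernel estimate at the old default for 2 features (4 components) with the proposed default (100 components).

```python
import math
import random


def default_n_components(num_features, floor=100):
    # Proposed default from this issue: max(100, 2*num_features).
    return max(floor, 2 * num_features)


def rbf_kernel(x, y, gamma):
    # Exact Gaussian kernel exp(-gamma * ||x - y||^2).
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def rff_estimate(x, y, n_components, gamma, rng):
    # Monte Carlo estimate of the Gaussian kernel from n_components
    # random Fourier features: w ~ N(0, 2*gamma*I), b ~ U(0, 2*pi).
    sigma = math.sqrt(2.0 * gamma)
    total = 0.0
    for _ in range(n_components):
        w = [rng.gauss(0.0, sigma) for _ in x]
        b = rng.uniform(0.0, 2.0 * math.pi)
        zx = math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
        zy = math.cos(sum(wi * yi for wi, yi in zip(w, y)) + b)
        total += 2.0 * zx * zy
    return total / n_components


def mean_squared_error(n_components, gamma=0.5, trials=200, seed=0):
    # Average squared error of the estimate over repeated random draws.
    rng = random.Random(seed)
    x, y = [0.3, -0.2], [0.1, 0.4]  # arbitrary illustrative points
    exact = rbf_kernel(x, y, gamma)
    errors = [(rff_estimate(x, y, n_components, gamma, rng) - exact) ** 2
              for _ in range(trials)]
    return sum(errors) / trials


if __name__ == "__main__":
    for num_features in (2, 10, 50, 200):
        print(num_features, default_n_components(num_features))
    old, new = 2 * 2, default_n_components(2)
    print("MSE with", old, "components:", mean_squared_error(old))
    print("MSE with", new, "components:", mean_squared_error(new))
```

Under these assumptions, the variance of the estimate shrinks roughly in proportion to 1/n_components, so the floor of 100 mainly benefits data sets with fewer than 50 features and leaves the default for larger data sets unchanged.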