Details
Type: New Feature
Status: Open
Priority: Minor
Resolution: Unresolved
Description
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit
https://scikit-learn.org/dev/glossary.html#term-class-weight
Introduce a "-class_weight=[0.1,0.2]" option, or a pair of "-pos_weight=0.2 -neg_weight=0.1" options.
With class_weight="balanced", scikit-learn computes the weight of each class y as follows:
> class_weight_y = #samples / (#classes * count_of_y)
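As a quick sanity check of the formula above, here is a minimal Python sketch. The helper name `balanced_class_weights` and the toy labels are illustrative assumptions, and the `-pos_weight` / `-neg_weight` options are the ones proposed in this issue, not existing flags.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * count_of_class)}."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy data: 8 negative and 2 positive examples.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)

# Option string in the form proposed in this issue (not an existing option).
options = f"-pos_weight={weights[1]} -neg_weight={weights[0]}"
print(weights)
print(options)
```

The minority class receives the larger weight, which is the behaviour the proposed options are meant to reproduce.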
It can be computed in SQL as follows:
-- For binary classification (#classes = 2)
-- weight = #samples / (#classes * count of rows with that label)
WITH weights as (
  select
    count(1) / (2 * sum(if(label=0, 1, 0))) as neg_weight,
    count(1) / (2 * sum(if(label=1, 1, 0))) as pos_weight
  from
    train
)
select
  train_classifier(features, label, concat('-pos_weight=', pos_weight, ' -neg_weight=', neg_weight))
from
  train l
  cross join weights r