Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Description
This improvement allows pruning words with high document frequencies from the tf and tf-idf vectors produced by seq2sparse. Pruning is based on the standard deviation of the words' document frequencies: the user specifies which words to prune as a multiple of this standard deviation. One good choice is 3 times the standard deviation.
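
As a rough illustration of the idea, the sketch below computes document frequencies over a tokenized corpus and drops every word whose document frequency lies above a cutoff. It is not the seq2sparse code: the function name `prune_high_df_terms`, the parameter `max_sigma`, and the exact cutoff (mean document frequency plus `max_sigma` times the population standard deviation) are assumptions made for this example, and the actual implementation may define the threshold differently.

```python
from collections import Counter
from statistics import mean, pstdev


def prune_high_df_terms(docs, max_sigma=3.0):
    """Drop terms whose document frequency is unusually high.

    docs is a list of token lists. Returns (pruned_docs, removed_terms).
    Hypothetical helper for illustration only, not part of Mahout.
    """
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    if not df:
        return [list(doc) for doc in docs], set()

    values = list(df.values())
    # Assumption: the cutoff is mean + max_sigma * population std dev.
    cutoff = mean(values) + max_sigma * pstdev(values)

    removed = {term for term, count in df.items() if count > cutoff}
    pruned = [[t for t in doc if t not in removed] for doc in docs]
    return pruned, removed


if __name__ == "__main__":
    # 20 documents that all share "the" plus one unique word each.
    corpus = [["the", f"w{i}"] for i in range(20)]
    pruned_docs, removed_terms = prune_high_df_terms(corpus, max_sigma=3.0)
    print(removed_terms)
    print(pruned_docs[:2])
```

Under these assumptions, a word is removed only when it appears in far more documents than a typical word does, so a larger `max_sigma` prunes fewer words.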