Details
Bug
Status: Resolved
Minor
Resolution: Incomplete
2.4.4
None
I'm on AWS but I presume this is happening everywhere.
Description
When you read in a "libsvm" file, the feature indices are required to be one-based, so lines look like this:
37.0 1:1.0 2:2.75
But then when you fit something like RandomForestRegressor and look at the feature importances, the indices are zero-based:
model.stages[-1].featureImportances
SparseVector(144, {0: 0.0292, 1: 0.0041}
I guess you can add one to make them line up, but why force us to do that? Either accept zero-based indices in libsvm files (easiest) or have featureImportances report one-based indices that match the input file.
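
For anyone hitting the same mismatch, here is a minimal sketch of the "add one" workaround. It assumes the loader shifts every libsvm index down by one, which is what the output in this report suggests. The helper name to_libsvm_indices is hypothetical and not part of Spark. It takes plain sequences, so with a real model you would pass the indices and values attributes of the SparseVector returned by featureImportances.

```python
def to_libsvm_indices(indices, values):
    """Map zero-based feature indices to the one-based indices used in the libsvm file.

    Hypothetical helper, not a Spark API. Assumes each index was shifted
    down by exactly one when the file was loaded.
    """
    return {int(i) + 1: float(v) for i, v in zip(indices, values)}


# Values copied from the featureImportances output in this report.
# With a fitted pipeline: fi = model.stages[-1].featureImportances
# and then to_libsvm_indices(fi.indices, fi.values).
importances = to_libsvm_indices([0, 1], [0.0292, 0.0041])
print(importances)  # {1: 0.0292, 2: 0.0041}
```

The keys of the result should then line up with the index:value pairs in the original libsvm lines.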