FeatureHasher, added in SPARK-13964, always treats numeric columns as numeric features and never as categorical features. It is quite common for data sources to represent categorical features as numbers or codes (often as Int).
To hash these features as categorical, users must first explicitly convert them to strings, which is cumbersome.
Add a new param, categoricalCols, that specifies the numeric columns to be treated as categorical features.
Note that while the reverse case is certainly possible (i.e. numeric features that are encoded as strings and that a user would like to treat as numeric), it is probably less common and will not be supported at this time.