Description
In many ML examples, the printed output is not useful. Sometimes show() is called, but the default cell truncation hides any pertinent results. For example, here is the output of max_abs_scaler_example:
$ bin/spark-submit examples/src/main/python/ml/max_abs_scaler_example.py
+-----+--------------------+--------------------+
|label|            features|      scaledFeatures|
+-----+--------------------+--------------------+
|  0.0|(692,[127,128,129...|(692,[127,128,129...|
|  1.0|(692,[158,159,160...|(692,[158,159,160...|
|  1.0|(692,[124,125,126...|(692,[124,125,126...|
Other times a few rows are printed out when show() might be more appropriate. Here is the output from binarizer_example:
$ bin/spark-submit examples/src/main/python/ml/binarizer_example.py
0.0
1.0
0.0
It would be much more useful to just show() the transformed DataFrame:
+-----+-------+-----------------+
|label|feature|binarized_feature|
+-----+-------+-----------------+
|    0|    0.1|              0.0|
|    1|    0.8|              1.0|
|    2|    0.2|              0.0|
+-----+-------+-----------------+
Issue Links
- is blocked by SPARK-16403 Example cleanup and fix minor issues (Resolved)