
Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: Examples, ML
    • Labels: None

    Description

      In many of the ML examples, the output is not useful. In some, show() is called but the columns are truncated, so any pertinent results are hidden. For example, here is the output of max_abs_scaler_example.py:

      $ bin/spark-submit examples/src/main/python/ml/max_abs_scaler_example.py 
      +-----+--------------------+--------------------+
      |label|            features|      scaledFeatures|
      +-----+--------------------+--------------------+
      |  0.0|(692,[127,128,129...|(692,[127,128,129...|
      |  1.0|(692,[158,159,160...|(692,[158,159,160...|
      |  1.0|(692,[124,125,126...|(692,[124,125,126...|
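
The cells above are cut off because show() by default shortens any cell string longer than 20 characters to its first 17 characters plus "...". A minimal pure-Python sketch of that rule follows; `truncate_cell` is a hypothetical helper written for illustration, not Spark's own code, and the sample vector string is made up.

```python
def truncate_cell(text, truncate=20):
    """Hypothetical helper mirroring the default truncation rule of show()."""
    if truncate > 0 and len(text) > truncate:
        if truncate < 4:
            # Too short to fit an ellipsis: hard cut.
            return text[:truncate]
        # Keep (truncate - 3) characters and mark the cut with "...".
        return text[:truncate - 3] + "..."
    return text


# Made-up sparse-vector string, used only for illustration.
cell = "(692,[127,128,129,130],[51.0,159.0,253.0,159.0])"

shortened = truncate_cell(cell)
print(shortened)                        # truncated to 20 characters
print(truncate_cell(cell, truncate=0))  # truncate=0 leaves the cell intact
```

In PySpark itself, calling `show(truncate=False)` on the DataFrame prints the full cell contents, which is probably what an example with wide vector columns wants.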
      

      In other examples, a few rows are printed one value at a time when show() would be more appropriate. Here is the output of binarizer_example.py:

      $ bin/spark-submit examples/src/main/python/ml/binarizer_example.py 
      0.0                                                                             
      1.0
      0.0
      

      It would be much more useful to just show() the transformed DataFrame:

      +-----+-------+-----------------+
      |label|feature|binarized_feature|
      +-----+-------+-----------------+
      |    0|    0.1|              0.0|
      |    1|    0.8|              1.0|
      |    2|    0.2|              0.0|
      +-----+-------+-----------------+
      

            People

              Assignee: bryanc Bryan Cutler
              Reporter: bryanc Bryan Cutler