  1. Spark
  2. SPARK-24946

PySpark - Allow np.Arrays and pd.Series in df.approxQuantile

    Details

      Description

      As a Python user, it would be convenient to pass a NumPy array or pandas Series as the `probabilities` parameter of `approxQuantile(col, probabilities, relativeError)`.

       

      This is especially handy for creating cumulative plots (say, in 1% steps), where one could simply call `approxQuantile(col, np.arange(0, 1.0, 0.01), relativeError)`.
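
Until something like this is supported, a possible workaround is to convert the probabilities to a plain Python list before the call. The sketch below assumes that `approxQuantile` accepts only a list or tuple of floats in [0, 1] for `probabilities`. The helper name `to_probability_list` is hypothetical and is not part of PySpark.

```python
from typing import Iterable, List


def to_probability_list(probabilities: Iterable[float]) -> List[float]:
    """Hypothetical helper (not a PySpark API): coerce a NumPy array,
    pandas Series, or any iterable of numbers into a list of floats."""
    # np.ndarray and pd.Series both expose .tolist(), which also converts
    # NumPy scalar types into built-in Python numbers.
    if hasattr(probabilities, "tolist"):
        values = probabilities.tolist()
    else:
        values = list(probabilities)

    result = [float(p) for p in values]

    # Reject values that cannot be probabilities before calling Spark.
    for p in result:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability {p} is outside [0, 1]")
    return result


# Intended usage with a PySpark DataFrame `df` (not run here; assumes
# pyspark and numpy are installed and `df` has a numeric column "value"):
#   probs = to_probability_list(np.arange(0, 1.0, 0.01))
#   quantiles = df.approxQuantile("value", probs, 0.01)
```

Doing the conversion on the caller's side keeps the sketch independent of the PySpark version. Supporting arrays natively, as requested here, would make the helper unnecessary.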

       

       


            People

            • Assignee: Unassigned
            • Reporter: Paul Westenthanner (PaulGroundation)
