Spark / SPARK-24946

PySpark - Allow np.Arrays and pd.Series in df.approxQuantile


Details

    Description

      As a Python user, it would be convenient to pass a NumPy array or pandas Series as the `probabilities` parameter of `approxQuantile(col, probabilities, relativeError)`.

      This is especially handy for creating cumulative plots (say, in 1% steps), where one could simply call `approxQuantile(col, np.arange(0, 1.0, 0.01), relativeError)`.
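Until something like this is supported, a possible workaround is to convert the probabilities to a plain list of Python floats before the call. The sketch below is only an illustration: `to_probability_list` is a hypothetical helper, not part of PySpark, and it assumes that `approxQuantile` wants a list or tuple of numbers in [0, 1]. It relies only on iteration, so it should accept a NumPy array or a pandas Series as well as any other iterable of numbers.

```python
def to_probability_list(probabilities):
    """Hypothetical helper (not a PySpark API): convert any iterable of
    numbers, e.g. a NumPy array or pandas Series, to a list of floats."""
    # float() turns NumPy scalar types such as np.float64 into Python floats.
    result = [float(p) for p in probabilities]
    for p in result:
        # Assumption: probabilities must lie in the closed interval [0, 1].
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1], got %r" % p)
    return result


# 1% steps built without NumPy, similar in spirit to np.arange(0, 1.0, 0.01).
steps = to_probability_list(i / 100 for i in range(100))
```

With a DataFrame `df` and a numeric column named `"value"` (both assumed here), the call would then look like `df.approxQuantile("value", to_probability_list(np.arange(0, 1.0, 0.01)), 0.01)`.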


          People

            Assignee: Unassigned
            Reporter: Paul Westenthanner (PaulGroundation)
