Implement a quantile function similar to https://numpy.org/doc/stable/reference/generated/numpy.quantile.html
It should support chunked arrays and calculate multiple quantiles in a single call.
The features could be implemented in steps:
- implement an exact quantile kernel that records all chunks and partitions them at finalize
- reduce the memory footprint for integer inputs by maintaining a "value:count" histogram
- implement an approximate quantile kernel that does not store the input values
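
The first two steps could be sketched roughly as follows in Python with NumPy. This is only an illustration of the idea, not a kernel implementation: the class names `ExactQuantile` and `HistogramQuantile` and the `consume`/`finalize` methods are hypothetical, and linear interpolation (the `numpy.quantile` default) is assumed. The approximate kernel from the third step is not covered here.

```python
from collections import Counter

import numpy as np


def _positions(qs, n):
    """Fractional sorted positions of the quantiles, plus their integer neighbours."""
    pos = np.asarray(qs, dtype=float) * (n - 1)
    lo = np.floor(pos).astype(int)
    hi = np.ceil(pos).astype(int)
    return pos, lo, hi


class ExactQuantile:
    """Step 1 (hypothetical): keep every chunk, partition once at finalize."""

    def __init__(self):
        self.chunks = []

    def consume(self, chunk):
        self.chunks.append(np.asarray(chunk))

    def finalize(self, qs):
        data = np.concatenate(self.chunks)
        pos, lo, hi = _positions(qs, data.size)
        # Partition around every index we need; no full sort is required.
        kth = sorted(set(lo.tolist()) | set(hi.tolist()))
        part = np.partition(data, kth).astype(float)
        return part[lo] + (part[hi] - part[lo]) * (pos - lo)


class HistogramQuantile:
    """Step 2 (hypothetical): for integer inputs, store only value:count pairs."""

    def __init__(self):
        self.counts = Counter()

    def consume(self, chunk):
        values, counts = np.unique(np.asarray(chunk), return_counts=True)
        for v, c in zip(values.tolist(), counts.tolist()):
            self.counts[v] += c

    def finalize(self, qs):
        values = np.array(sorted(self.counts), dtype=float)
        cum = np.cumsum([self.counts[int(v)] for v in values])
        pos, lo, hi = _positions(qs, int(cum[-1]))
        # Sorted index i falls in the first bucket whose cumulative count exceeds i.
        v_lo = values[np.searchsorted(cum, lo, side="right")]
        v_hi = values[np.searchsorted(cum, hi, side="right")]
        return v_lo + (v_hi - v_lo) * (pos - lo)


# Usage: feed the same chunks to both sketches and ask for several quantiles at once.
chunks = [[3, 1, 2], [5, 4, 4]]
exact, hist = ExactQuantile(), HistogramQuantile()
for chunk in chunks:
    exact.consume(chunk)
    hist.consume(chunk)
exact_result = exact.finalize([0.25, 0.5, 0.9])
hist_result = hist.finalize([0.25, 0.5, 0.9])
```

In this sketch the histogram variant needs memory proportional to the number of distinct values rather than the number of inputs, which is why it is only attractive for integer data with a limited value range.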