SPARK-40238: Support scaleUpFactor and initialNumPartition in PySpark RDD API


Details

    • Type: Story
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

Description

This is a follow-up to https://issues.apache.org/jira/browse/SPARK-40211.

The `scaleUpFactor` and `initialNumPartition` configs are not yet supported in the PySpark RDD `take` API
(see https://github.com/apache/spark/blob/master/python/pyspark/rdd.py#L2799).

The Python implementation hard-codes both values (an initial partition count of 1 and a scale-up factor of 4) instead of reading them from the configuration, so the PySpark RDD `take` API is inconsistent with the Scala API.
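
As a rough sketch of what the fix could look like, the partition-growth step inside `take` could be parameterized instead of using the hard-coded constants, with the values read from the job configuration. The config keys `spark.rdd.limit.scaleUpFactor` and `spark.rdd.limit.initialNumPartitions` below are assumed to mirror the Scala side from SPARK-40211; the exact key names and defaults should be confirmed against the Scala implementation.

```python
# Sketch only: parameterize the partition-growth logic used by RDD.take()
# instead of the hard-coded initial partition count (1) and scale-up factor (4).

def num_parts_to_try(parts_scanned: int, items_found: int, num_wanted: int,
                     initial_num_partitions: int, scale_up_factor: int) -> int:
    """How many partitions the next take() iteration should scan."""
    if parts_scanned == 0:
        # First pass: controlled by initialNumPartition instead of a hard-coded 1.
        return initial_num_partitions
    if items_found == 0:
        # Nothing found yet: grow by scaleUpFactor instead of a hard-coded 4.
        return parts_scanned * scale_up_factor
    # Otherwise interpolate the number of partitions needed, overestimating
    # by 50% and capping the growth at the scale-up factor.
    estimate = int(1.5 * num_wanted * parts_scanned / items_found) - parts_scanned
    return min(max(estimate, 1), parts_scanned * scale_up_factor)


# Inside RDD.take() the two values could be read from the configuration, e.g.
# (assumed config keys, mirroring the Scala side):
#   conf = self.context.getConf()
#   scale_up_factor = max(int(conf.get("spark.rdd.limit.scaleUpFactor", "4")), 2)
#   initial_num_partitions = max(int(conf.get("spark.rdd.limit.initialNumPartitions", "1")), 1)
```

With a scale-up factor of 4 and an initial partition count of 1 this reproduces the current behaviour, so honoring the configs should be backward compatible by default.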

       

Anyone familiar with PySpark is welcome to help add support for this, referring to the Scala implementation.
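
For whoever picks this up, a hypothetical end-to-end check once the configs are wired through (again assuming the config keys above, which may differ in the final patch):

```python
from pyspark import SparkConf, SparkContext

# Hypothetical check: once take() honors the configs, these settings should
# change how many partitions the first pass scans and how fast it grows.
conf = (SparkConf()
        .setMaster("local[4]")
        .setAppName("take-config-check")
        .set("spark.rdd.limit.scaleUpFactor", "2")           # assumed key
        .set("spark.rdd.limit.initialNumPartitions", "8"))   # assumed key
sc = SparkContext(conf=conf)

rdd = sc.parallelize(range(10000), numSlices=100)
print(rdd.take(5))  # expected: first pass scans 8 partitions, then grows 2x on retries
sc.stop()
```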

People

    • Assignee: Unassigned
    • Reporter: Ziqi Liu (liuzq12)
