Description
I was trying to translate Scala into Python with PySpark 2.4.0. The code below aims to extract the value of column 'list' using column 'num' as the index.

```python
x = spark.createDataFrame([((1,2,3),1),((4,5,6),2),((7,8,9),3)], ['list','num'])
x.show()
```
| list | num |
|---|---|
| [1,2,3] | 1 |
| [4,5,6] | 2 |
| [7,8,9] | 3 |
I expected to be able to use the new function 'element_at' added in 2.4.0, but it raises an error:

```python
x.withColumn('aa', F.element_at('list', x.num.cast('int')))
```

TypeError: Column is not iterable
In the end I had to use a udf to work around this problem.
But in Scala it works fine when the second parameter 'extraction' of 'element_at' is a column of int type:

```scala
// Scala
val y = x.withColumn("aa", element_at('list, 'num.cast("int")))
y.show()
```
| list | num | aa |
|---|---|---|
| [1,2,3] | 1 | 1 |
| [4,5,6] | 2 | 5 |
| [7,8,9] | 3 | 9 |
I hope this can be fixed in a future version.