Description
I was trying to translate Scala into Python with PySpark 2.4.0. The code below aims to extract the value of column 'list' using column 'num' as the index.

```python
x = spark.createDataFrame([((1,2,3),1),((4,5,6),2),((7,8,9),3)], ['list','num'])
x.show()
```
| list | num |
|---|---|
| [1,2,3] | 1 |
| [4,5,6] | 2 |
| [7,8,9] | 3 |
I expected to be able to use the new function 'element_at' added in 2.4.0, but it raises an error:

```python
x.withColumn('aa', F.element_at('list', x.num.cast('int')))
```

TypeError: Column is not iterable
In the end I had to use a udf to work around this problem.
But in Scala it works fine when the second parameter 'extraction' of 'element_at' is a column of int type:

```scala
// Scala
val y = x.withColumn("aa", element_at('list, 'num.cast("int")))
y.show()
```
| list | num | aa |
|---|---|---|
| [1,2,3] | 1 | 1 |
| [4,5,6] | 2 | 5 |
| [7,8,9] | 3 | 9 |
I hope this can be fixed in a future version.