Affects Version/s: 2.4.1
Fix Version/s: None
I have faced an issue while casting the datatype of a column in PySpark 2.4.1.
Say that I have the following data frame, in which column B is a string holding a list of arrays, and I want to convert column B to an ArrayType, so I used the following code.
The new column C is an ArrayType of ArrayType with elements of DoubleType. With this code, however, the integer-valued entry 11 was not converted; it is missing from the final output.
||row||B (input string)||C (converted output)||
|row1|[[12.46575,13.78697],[10.565,*11*]]|[[12.46575, 13.78697], [10.565,]]|
|row2|[[1.2345,13.45454],[6.6868,0.234524]]|[[1.2345, 13.45454], [6.6868, 0.234524]]|
As you can see, column C does not contain 11. If I replace DoubleType with FloatType I get the same error, and if I replace it with DecimalType the output is entirely empty.
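The conversion snippet itself is not shown above, but the symptom (only the integer-looking value 11 disappears, while every value with a decimal point survives) is consistent with a JSON-style parse that distinguishes integer tokens from floating-point tokens. As a quick plain-Python sanity check (the string below is the row-1 value from the table; that Spark's parser is rejecting the integer token for a DoubleType field is my assumption, not something confirmed by this report):

```python
import json

# Row-1 value of column B, as a raw string.
s = '[[12.46575,13.78697],[10.565,11]]'

parsed = json.loads(s)

# JSON distinguishes integer and floating-point number tokens:
# 11 parses as int, the other values parse as float. A strict parser
# that only accepts floating-point tokens for a DoubleType element
# would drop 11 while keeping the floats.
types = [type(x).__name__ for row in parsed for x in row]
print(types)  # ['float', 'float', 'float', 'int']
```

If this is the cause, writing the value as `11.0` in the source string would be expected to make it survive the cast.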
I am not sure whether there is an issue with my code or whether this is a bug.
I hope someone can provide some clarification on this. Thanks!