Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.6.0
Description
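When a Row is constructed from kwargs, the fields in the underlying tuple are sorted by name. When a schema (StructType) reads the row, it does not use the fields in that sorted order; it pairs the tuple values with the schema fields by position, so values end up under the wrong field names.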
from pyspark.sql import Row
from pyspark.sql.types import *

schema = StructType([
    StructField("id", StringType()),
    StructField("first_name", StringType())])
row = Row(id="39", first_name="Szymon")
schema.toInternal(row)
Out[5]: ('Szymon', '39')
df = sqlContext.createDataFrame([row], schema)
df.show(1)
+------+----------+
|    id|first_name|
+------+----------+
|Szymon|        39|
+------+----------+
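
The mismatch above can be reproduced without Spark. The following is a minimal sketch in plain Python, assuming the behaviour described in this report (kwargs sorted by name, schema applied by position). It is a simplified stand-in, not PySpark's actual implementation; make_row and schema_fields are hypothetical names used only for illustration.

```python
# Simplified model of the reported behaviour; not PySpark's real code.

def make_row(**kwargs):
    """Mimic a kwargs-built Row: names are sorted, values follow that order."""
    names = sorted(kwargs)
    values = tuple(kwargs[name] for name in names)
    return names, values

# Field order as declared in the schema (hypothetical stand-in for StructType).
schema_fields = ["id", "first_name"]

names, values = make_row(id="39", first_name="Szymon")
print(values)  # ('Szymon', '39')

# Positional mapping: the i-th schema field gets the i-th tuple value.
by_position = dict(zip(schema_fields, values))
print(by_position)  # {'id': 'Szymon', 'first_name': '39'}

# Name-based mapping: each schema field is looked up by its name instead.
lookup = dict(zip(names, values))
by_name = {field: lookup[field] for field in schema_fields}
print(by_name)  # {'id': '39', 'first_name': 'Szymon'}
```

Under this model, supplying the values in the schema's declared order (for example as a plain tuple) or resolving fields by name avoids the swap.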
Issue Links
- duplicates SPARK-12467 Get rid of sorting in Row's constructor in pyspark (Resolved)