Description
StructType looks an awful lot like a Python dictionary.
However, it doesn't implement __iter__(), so doing a quick conversion like this doesn't work:
>>> df = sqlContext.jsonRDD(sc.parallelize(['{"name": "El Magnifico"}']))
>>> df.schema
StructType(List(StructField(name,StringType,true)))
>>> dict(df.schema)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'StructType' object is not iterable
Supporting dict() on a StructType would be super helpful for doing any custom schema manipulations without having to go through the whole .json() -> json.loads() -> manipulate() -> json.dumps() -> .fromJson() charade.
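For reference, the round trip described above might look roughly like the sketch below. It uses only the standard json module so it runs without Spark. The schema_json literal is an assumed stand-in for what df.schema.json() would return, and rename_field is a hypothetical helper, not part of PySpark.

```python
import json

# Assumed stand-in for the string returned by df.schema.json();
# in a real session this would come from PySpark, not a literal.
schema_json = (
    '{"type": "struct", "fields": ['
    '{"name": "name", "type": "string", "nullable": true, "metadata": {}}'
    ']}'
)


def rename_field(schema_json, old, new):
    """Hypothetical helper: rename one top-level field in a schema JSON string."""
    schema = json.loads(schema_json)      # JSON text -> plain dict
    for field in schema["fields"]:        # manipulate the dict
        if field["name"] == old:
            field["name"] = new
    return json.dumps(schema)             # dict -> JSON text again


updated = rename_field(schema_json, "name", "full_name")
print(updated)
```

In a real session the result would still have to be turned back into a StructType, which is the extra hop this ticket would like to avoid.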
Same goes for Row, which offers an asDict() method but doesn't support the more Pythonic dict(Row).
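To illustrate the requested behavior, here is a minimal sketch of why an __iter__() that yields (name, value) pairs is enough for dict() to work. RowSketch is a hypothetical stand-in written for this example; it is not PySpark's Row and says nothing about how Row or StructType are actually implemented.

```python
class RowSketch(object):
    """Hypothetical stand-in for a Row-like object; not PySpark's Row."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def asDict(self):
        # Explicit conversion, mirroring the method name mentioned above
        return dict(self._fields)

    def __iter__(self):
        # Yield (name, value) pairs so dict() can consume the object directly
        return iter(self._fields.items())


row = RowSketch(name="El Magnifico")
as_dict = dict(row)  # works because iteration yields key/value pairs
print(as_dict == row.asDict())
```

dict() accepts any iterable of two-item pairs, so no other methods are needed for the conversion itself.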