Description
In 1.3.1 this worked:
df = sqlContext.createDataFrame([[1]], schema=['col'])
df.selectExpr('null as newCol').collect()
In 1.4.0 it fails with the following stacktrace:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/opt/boxen/homebrew/opt/apache-spark/libexec/python/pyspark/sql/dataframe.py", line 316, in collect
    cls = _create_cls(self.schema)
  File "/opt/boxen/homebrew/opt/apache-spark/libexec/python/pyspark/sql/dataframe.py", line 229, in schema
    self._schema = _parse_datatype_json_string(self._jdf.schema().json())
  File "/opt/boxen/homebrew/opt/apache-spark/libexec/python/pyspark/sql/types.py", line 519, in _parse_datatype_json_string
    return _parse_datatype_json_value(json.loads(json_string))
  File "/opt/boxen/homebrew/opt/apache-spark/libexec/python/pyspark/sql/types.py", line 539, in _parse_datatype_json_value
    return _all_complex_types[tpe].fromJson(json_value)
  File "/opt/boxen/homebrew/opt/apache-spark/libexec/python/pyspark/sql/types.py", line 386, in fromJson
    return StructType([StructField.fromJson(f) for f in json["fields"]])
  File "/opt/boxen/homebrew/opt/apache-spark/libexec/python/pyspark/sql/types.py", line 347, in fromJson
    _parse_datatype_json_value(json["type"]),
  File "/opt/boxen/homebrew/opt/apache-spark/libexec/python/pyspark/sql/types.py", line 535, in _parse_datatype_json_value
    raise ValueError("Could not parse datatype: %s" % json_value)
ValueError: Could not parse datatype: null
https://github.com/apache/spark/blob/v1.4.0/python/pyspark/sql/types.py#L461
The cause: _atomic_types doesn't contain NullType, so the type name "null" in the schema JSON cannot be resolved.
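To illustrate the failure mode without a Spark installation, here is a minimal sketch that mimics the name-to-type lookup. The classes and the helpers build_registry and parse_datatype_json_value are simplified stand-ins written for this report, not the actual pyspark code; it assumes the registry is keyed by a type name derived from the class name.

```python
# Simplified stand-ins for pyspark.sql.types (hypothetical, for illustration only).
class DataType(object):
    @classmethod
    def typeName(cls):
        # Derive the JSON type name from the class name: "NullType" -> "null".
        return cls.__name__[:-4].lower()


class StringType(DataType):
    pass


class IntegerType(DataType):
    pass


class NullType(DataType):
    pass


def build_registry(types):
    # Map each type name to its class, e.g. {"string": StringType, ...}.
    return dict((t.typeName(), t) for t in types)


def parse_datatype_json_value(json_value, registry):
    # Look the name up in the registry; unknown names raise ValueError.
    if json_value in registry:
        return registry[json_value]()
    raise ValueError("Could not parse datatype: %s" % json_value)


# Registry without NullType: "null" cannot be resolved.
without_null = build_registry([StringType, IntegerType])
try:
    parse_datatype_json_value("null", without_null)
    error = None
except ValueError as exc:
    error = str(exc)

# Registry with NullType: the same lookup succeeds.
with_null = build_registry([StringType, IntegerType, NullType])
parsed = parse_datatype_json_value("null", with_null)
```

In this sketch, registering NullType alongside the other atomic types is enough for the "null" lookup to succeed, which suggests the corresponding fix in types.py.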