Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 3.5.1
Description
When using pyspark.sql.types.TimestampType, if your value is a datetime.datetime object with a tzinfo, a typo in pyspark/sql/pandas/types.py (pd.Timstamp instead of pd.Timestamp inside convert_timestamp) breaks the conversion and raises an AttributeError.
I believe this commit introduced the bug 9 months ago.
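A minimal reproduction sketch (not from the original report): it assumes the Arrow-optimized Python UDF path and returns an array of structs holding a tz-aware timestamp, so the value is routed through convert_array -> convert_struct -> convert_timestamp as in the trace below. The exact job that hit this in the report is not shown, so treat this as illustrative only.

import datetime

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import ArrayType, StructField, StructType, TimestampType

spark = SparkSession.builder.getOrCreate()

# Shape chosen to mirror the trace: an array of structs containing a timestamp.
schema = ArrayType(StructType([StructField("ts", TimestampType())]))

@udf(returnType=schema, useArrow=True)  # Arrow path, matching the serializer frames in the trace
def make_ts(_):
    # A datetime.datetime carrying tzinfo is the value that hits the broken branch.
    return [{"ts": datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc)}]

df = spark.range(1).select(make_ts("id").alias("rows"))
df.show()  # fails with: AttributeError: module 'pandas' has no attribute 'Timstamp'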
Full stack trace below:
File "/databricks/spark/python/pyspark/worker.py", line 1490, in main process() File "/databricks/spark/python/pyspark/worker.py", line 1482, in process serializer.dump_stream(out_iter, outfile) File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 531, in dump_stream return ArrowStreamSerializer.dump_stream( File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 107, in dump_stream for batch in iterator: File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 525, in init_stream_yield_batches batch = self._create_batch(series) File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 511, in _create_batch arrs.append(self._create_array(s, t, arrow_cast=self._arrow_cast)) File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 284, in _create_array series = conv(series) File "/databricks/spark/python/pyspark/sql/pandas/types.py", line 1060, in <lambda> return lambda pser: pser.apply( # type: ignore[return-value] File "/databricks/python/lib/python3.10/site-packages/pandas/core/series.py", line 4771, in apply return SeriesApply(self, func, convert_dtype, args, kwargs).apply() File "/databricks/python/lib/python3.10/site-packages/pandas/core/apply.py", line 1123, in apply return self.apply_standard() File "/databricks/python/lib/python3.10/site-packages/pandas/core/apply.py", line 1174, in apply_standard mapped = lib.map_infer( File "pandas/_libs/lib.pyx", line 2924, in pandas._libs.lib.map_infer File "/databricks/spark/python/pyspark/sql/pandas/types.py", line 1061, in <lambda> lambda x: conv(x) if x is not None else None # type: ignore[misc] File "/databricks/spark/python/pyspark/sql/pandas/types.py", line 889, in convert_array return [ File "/databricks/spark/python/pyspark/sql/pandas/types.py", line 890, in <listcomp> _element_conv(v) if v is not None else None # type: ignore[misc] File "/databricks/spark/python/pyspark/sql/pandas/types.py", line 1010, in convert_struct return { File "/databricks/spark/python/pyspark/sql/pandas/types.py", line 1011, in <dictcomp> name: conv(v) if conv is not None and v is not None else v File "/databricks/spark/python/pyspark/sql/pandas/types.py", line 1032, in convert_timestamp ts = pd.Timstamp(value) File "/databricks/python/lib/python3.10/site-packages/pandas/__init__.py", line 264, in __getattr__ raise AttributeError(f"module 'pandas' has no attribute '{name}'") AttributeError: module 'pandas' has no attribute 'Timstamp'