Date and timestamp types are not yet supported in DataFrame.toPandas() when using ArrowConverters. Both are common in data analysis with Spark and Pandas and should be supported.
PySpark and Arrow differ in how they internally store timestamps that have no time zone specified. PySpark takes a UTC timestamp and adjusts it to local time, whereas Arrow keeps the value in UTC. Hopefully there is a clean way to resolve this.
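To illustrate the discrepancy described above, here is a minimal sketch using only the Python standard library. It does not call Spark or Arrow; the fixed UTC-8 offset stands in for a hypothetical local time zone, and the variable names are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

# One instant, expressed as microseconds since the Unix epoch.
micros = 1_500_000_000 * 1_000_000

# The instant as a timezone-aware UTC datetime.
utc_value = EPOCH_UTC + timedelta(microseconds=micros)

# Hypothetical local zone (assumption for illustration): fixed UTC-8.
local_zone = timezone(timedelta(hours=-8))

# Wall-clock reading after adjusting to local time, time zone dropped.
local_naive = utc_value.astimezone(local_zone).replace(tzinfo=None)

# Wall-clock reading when the value is left in UTC, time zone dropped.
utc_naive = utc_value.replace(tzinfo=None)

print(local_naive)  # 2017-07-13 18:40:00
print(utc_naive)    # 2017-07-14 02:40:00
```

Once the time zone is dropped, the two readings differ by the local UTC offset, so a conversion has to agree on which interpretation a timezone-less value carries.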
Spark internal storage spec:
- DateType stored as days since the Unix epoch (1970-01-01)
- TimestampType stored as microseconds since the Unix epoch
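The storage spec above can be sketched as plain Python conversions. This is an illustrative sketch only, assuming both values count from the Unix epoch; the helper names `days_to_date` and `micros_to_utc_datetime` are hypothetical and are not part of Spark or Arrow.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

def days_to_date(days: int) -> date:
    """Interpret an integer day count as a calendar date (hypothetical helper)."""
    return EPOCH_DATE + timedelta(days=days)

def micros_to_utc_datetime(micros: int) -> datetime:
    """Interpret an integer microsecond count as a UTC datetime (hypothetical helper)."""
    return EPOCH_UTC + timedelta(microseconds=micros)

print(days_to_date(16801))                               # 2016-01-01
print(micros_to_utc_datetime(16801 * 86_400 * 10**6))    # 2016-01-01 00:00:00+00:00
```

Using integer arithmetic on `timedelta` avoids the floating-point rounding that can occur when microsecond counts are divided into fractional seconds.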