Description
This issue tracks adding support for the remaining types in the Arrow converters. Currently, only primitive data types are supported.
Remaining types:
- Date
- Timestamp
- Complex: Struct, Array, Map
- Decimal
- Binary
- Categorical when converting from Pandas
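For orientation, the remaining types would plausibly map onto Arrow types along the following lines. This is a minimal sketch, not the converter code: the `ARROW_TARGETS` table, the `arrow_target` helper, and the string notation for the Arrow types are hypothetical names chosen for illustration, and the exact target types are an assumption.

```python
# Illustrative lookup from Spark SQL type names to a plausible Arrow target.
# Hypothetical table for discussion only; not taken from the Spark source.
ARROW_TARGETS = {
    "DateType": "date32",
    "TimestampType": "timestamp[us]",
    "DecimalType": "decimal128(precision, scale)",
    "BinaryType": "binary",
    "ArrayType": "list<element>",
    "StructType": "struct<fields>",
    "MapType": "map<key, value>",
}


def arrow_target(spark_type_name):
    """Return the assumed Arrow target for a Spark type name.

    Raises TypeError for anything not in the table, mirroring the idea that
    unsupported types should fail loudly instead of converting silently.
    """
    try:
        return ARROW_TARGETS[spark_type_name]
    except KeyError:
        raise TypeError("Unsupported type in Arrow conversion: " + spark_type_name)


if __name__ == "__main__":
    for name in sorted(ARROW_TARGETS):
        print(name, "->", arrow_target(name))
```

Categorical is left out of the table because it is a Pandas dtype and not a Spark SQL type; it only matters on the Pandas-to-Spark path.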
Some things to do before closing this out:
- Look into upgrading to Arrow 0.7 for better Decimal support (can now write values as BigDecimal)
- Need to add some user docs
- Make sure Python tests are thorough
- Check into complex type support mentioned in comments by leif; should we support multi-indexing?
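One check that thorough Python tests for Decimal might include is whether a value fits a fixed precision and scale without rounding. The sketch below uses only the standard library; `fits_decimal_type` is a hypothetical helper written for this note, and the assumption is that a decimal column is declared with a (precision, scale) pair where `precision - scale` digits are available before the decimal point.

```python
from decimal import Decimal


def fits_decimal_type(value, precision, scale):
    """Return True if `value` fits (precision, scale) exactly, with no rounding.

    Hypothetical test helper; not part of Spark or Arrow.
    """
    _sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        # NaN and Infinity carry a string exponent and cannot be stored.
        return False
    if not any(digits):
        # Zero is representable at any precision and scale.
        return True
    fractional_digits = max(-exponent, 0)
    if fractional_digits > scale:
        return False
    # Digits before the decimal point, including trailing zeros from a
    # positive exponent such as Decimal("1E+3").
    integer_digits = max(len(digits) + exponent, 0)
    return integer_digits <= precision - scale


if __name__ == "__main__":
    print(fits_decimal_type(Decimal("123.45"), 5, 2))
    print(fits_decimal_type(Decimal("123.456"), 5, 2))
```

A test suite could run such a check over boundary values (maximum digits, zero, negative numbers) before asserting on the round trip itself.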
Issue Links
- is part of
  - SPARK-22216 Improving PySpark/Pandas interoperability (Resolved)
- is related to
  - SPARK-23836 Support returning StructType to the level support in GroupedMap Arrow's "scalar" UDFS (or similar) (Resolved)
  - SPARK-27834 Make separate PySpark/SparkR vectorization configurations (Resolved)
  - SPARK-26759 Arrow optimization in SparkR's interoperability (Resolved)
- relates to
  - SPARK-32285 Add PySpark support for nested timestamps with arrow (In Progress)
  - SPARK-33489 Support null for conversion from and to Arrow type (Resolved)