  1. Apache Arrow
  2. ARROW-2646

[C++/Python] Pandas roundtrip for date objects


Details

    Description

      Arrow currently converts date objects to nanosecond-precision datetime objects. I'd like a way to preserve the original type during a round trip.

      >>> import pandas as pd
      >>> import pyarrow as pa
      >>> import datetime
      >>> pa.date32().to_pandas_dtype()
      dtype('<M8[ns]')
      >>> df = pd.DataFrame({'date': [datetime.date(2018, 1, 1)]})
      >>> df.dtypes
      date    object
      dtype: object
      >>> df_roundtrip = pa.Table.from_pandas(df).to_pandas()
      >>> df_roundtrip.dtypes
      date    datetime64[ns]
      dtype: object
      

      I'd expect something like this to work:

      >>> import pandas.testing as pdt
      >>> df_roundtrip = pa.Table.from_pandas(df).to_pandas(date_as_object=True)
      >>> pdt.assert_frame_equal(df_roundtrip, df)
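
      To illustrate why a lossless round trip should be possible: a date32 value is stored as a count of days since the UNIX epoch, so it carries no time-of-day component and can map straight back to a datetime.date. The following is only a standard-library sketch of that idea, not pyarrow's implementation; the helper names date_to_days, days_to_date and days_to_datetime are hypothetical.

```python
import datetime

# date32 semantics: an integer number of days since 1970-01-01.
EPOCH = datetime.date(1970, 1, 1)


def date_to_days(d):
    """Encode a date as days since the UNIX epoch (hypothetical helper)."""
    return (d - EPOCH).days


def days_to_date(days):
    """Decode back to datetime.date, keeping the original type (hypothetical helper)."""
    return EPOCH + datetime.timedelta(days=days)


def days_to_datetime(days):
    """Decode to a midnight datetime, analogous to the widening described above."""
    return datetime.datetime(1970, 1, 1) + datetime.timedelta(days=days)


original = datetime.date(2018, 1, 1)
days = date_to_days(original)
restored = days_to_date(days)      # still a plain datetime.date
widened = days_to_datetime(days)   # a datetime.datetime, the type has changed
```

      With an option like the proposed date_as_object=True, the conversion would take the first decoding path instead of the second.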
      

            People

              kszucs Krisztian Szucs
              fjetter Florian Jetter
              Votes: 0
              Watchers: 6

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 2h 20m