Apache Arrow / ARROW-1425

[Python] Document semantic differences between Spark timestamps and Arrow timestamps

Details

    Description

      Spark treats timestamps that carry no time zone information as session local. This can cause problems with pyarrow, which may view the data coming from toPandas() as time zone naive, but with fields as though they were UTC rather than session local. We should carefully document how to handle data coming from Spark so that users avoid these problems.

      cc Bryan Cutler Holden Karau
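To make the mismatch concrete, here is a minimal sketch using pandas only. It does not reproduce Spark or pyarrow behavior; the series below is a stand-in for a time zone naive column such as one returned by toPandas(), and `SESSION_TZ` is a hypothetical value standing in for the Spark session time zone (the `spark.sql.session.timeZone` setting). It shows only that the same naive fields name different instants depending on whether they are read as UTC or as session local; which reading is correct depends on how the values were produced.

```python
import pandas as pd

# Hypothetical session time zone, assumed for this sketch only.
SESSION_TZ = "America/Los_Angeles"

# Stand-in for a time zone naive timestamp column (no Spark involved here).
naive = pd.Series(pd.to_datetime(["2017-08-28 09:00:00", "2017-12-01 09:00:00"]))

# Interpretation 1: the naive fields are UTC wall-clock values.
as_utc = naive.dt.tz_localize("UTC")

# Interpretation 2: the naive fields are session-local wall-clock values,
# normalized to UTC so both results can be compared directly.
as_session = naive.dt.tz_localize(SESSION_TZ).dt.tz_convert("UTC")

# The two readings disagree by the session zone's UTC offset,
# which also changes with daylight saving time.
offset = as_session - as_utc
print(offset.tolist())  # 7 hours for the August value, 8 hours for December
```

Attaching an explicit time zone before handing the data to pyarrow, as in either `tz_localize` call above, removes the ambiguity because the resulting column is no longer time zone naive.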

          People

            emkornfield@gmail.com Micah Kornfield
            wesm Wes McKinney

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 6h