Apache Arrow / ARROW-17893

[Python] Bug: Wrong reading of timedelta

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 8.0.0
    • Fix Version/s: 11.0.0
    • Component/s: Python
    • Environment: macOS 12.6 on an Apple M1 Ultra

    Description

      When a DataFrame has a timedelta column and also a column holding a list of dictionaries that themselves contain timedelta values, reading the top-level timedelta column back from a Feather file sometimes returns the wrong value.

      In the example below, the printed results show that the top-level timedelta is sometimes read correctly as 0 days 03:40:23 and sometimes incorrectly as 153 days 01:03:20.

      The code follows; it is also attached as check_timedelta.py.

       

      from datetime import datetime, timedelta
      import pandas as pd
      import pyarrow.feather as feather
      time_1 = datetime.fromisoformat("2022-04-21T10:18:12+03:00")
      time_2 = datetime.fromisoformat("2022-04-21T13:58:35+03:00")
      data = [
          {
              "waiting_time": timedelta(seconds=12, microseconds=1),
          },
          {
              "waiting_time": timedelta(seconds=1020),
          },
          {
              "waiting_time": timedelta(seconds=960),
          },
          {
              "waiting_time": timedelta(seconds=960),
          },
          {
              "waiting_time": timedelta(seconds=960),
          },
          {
              "waiting_time": timedelta(seconds=815, microseconds=1),
          },
      ]
      df = pd.DataFrame(
          [
              {
                  "time_1": time_1,
                  "time_2": time_2,
                  "data": data,
                  "timedelta_1": time_2 - time_1,
                  "timedelta_2": timedelta(hours=3, minutes=40, seconds=23),
              },
          ]
      )
      
      print("Correct timedelta_1: ", df["timedelta_1"].item())
      print("Correct timedelta_2: ", df["timedelta_2"].item())
      
      # Write the frame once as an LZ4-compressed Feather file.
      with open("records.feather.lz4", "wb") as f:
          feather.write_feather(df, f, compression="lz4")
      
      # Read the same file repeatedly; the printed values vary between reads.
      for _ in range(10):
          with open("records.feather.lz4", "rb") as f:
              print("Reading timedelta_1: ", feather.read_feather(f)["timedelta_1"].item())
              print("Reading timedelta_2: ", feather.read_feather(f)["timedelta_2"].item())
      

       

       

      Printed Results

       

      Correct timedelta_1:  0 days 03:40:23
      Correct timedelta_2:  0 days 03:40:23
      Reading timedelta_1:  0 days 03:40:23
      Reading timedelta_2:  0 days 03:40:23
      Reading timedelta_1:  0 days 03:40:23
      Reading timedelta_2:  0 days 03:40:23
      Reading timedelta_1:  153 days 01:03:20
      Reading timedelta_2:  153 days 01:03:20
      Reading timedelta_1:  0 days 03:40:23
      Reading timedelta_2:  0 days 03:40:23
      Reading timedelta_1:  0 days 03:40:23
      Reading timedelta_2:  0 days 03:40:23
      Reading timedelta_1:  0 days 03:40:23
      Reading timedelta_2:  153 days 01:03:20
      Reading timedelta_1:  153 days 01:03:20
      Reading timedelta_2:  0 days 03:40:23
      Reading timedelta_1:  0 days 03:40:23
      Reading timedelta_2:  153 days 01:03:20
      Reading timedelta_1:  153 days 01:03:20
      Reading timedelta_2:  153 days 01:03:20
      Reading timedelta_1:  153 days 01:03:20
      Reading timedelta_2:  153 days 01:03:20
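
One observation that the report does not state: the wrong reading is exactly 1000 times the correct one. The short check below uses only the standard library and the two values printed above. A factor of 1000 would be consistent with the stored integer being interpreted with the wrong time unit (for example, microseconds versus nanoseconds), but that is an assumption here, not a confirmed diagnosis.

```python
from datetime import timedelta

# Values copied from the printed results above.
correct = timedelta(hours=3, minutes=40, seconds=23)
wrong = timedelta(days=153, hours=1, minutes=3, seconds=20)

# Dividing one timedelta by another yields a plain float ratio.
ratio = wrong / correct

print("correct seconds:", correct.total_seconds())  # 13223.0
print("wrong seconds:  ", wrong.total_seconds())    # 13223000.0
print("ratio:          ", ratio)                    # 1000.0
```

The ratio alone does not say which units were confused or where in the read path the confusion happens; it only narrows down what kind of error to look for.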

       

       

      Attachments

        1. check_timedelta.py
          1 kB
          Yaser Alraddadi

            People

              Assignee: Alenka Frim (alenka)
              Reporter: Yaser Alraddadi (gam.phon)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m