ARROW-9907

[Python] Failed to parse string into timestamp

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Python
    • Labels: None

    Description

      Hi,

      Not sure if I am missing something, but I am unable to get pyarrow to parse my datetime values as timestamps; they are being inferred as strings.

      My strings are arriving in CSVs with this format: '2015-01-09 00:00:00.000'

      I have tried:
      ```
      from pyarrow import csv

      # Try to parse the column with an explicit strptime-style format
      convert_opts = csv.ConvertOptions(timestamp_parsers=['%Y-%m-%d %H:%M:%S.%f'])
      df = csv.read_csv('path_to_csv', convert_options=convert_opts)
      print(df.schema)
      ```

      This changes nothing: the columns containing these timestamps still show as strings in the schema.

      Additionally, I have tried casting as well:
      ```
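      import pyarrow as pa
      from pyarrow import csv

      # Target schema: the date column should be a millisecond timestamp
      dfschema = pa.schema([
          ('date_column', pa.timestamp('ms'))
      ])
      df = csv.read_csv('path_to_csv')
      df = df.cast(target_schema=dfschema)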
      ```

      This raises the error: "pyarrow.lib.ArrowInvalid: Failed to parse string: 2015-01-09 00:00:00.000"
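
For anyone hitting the same error, one possible workaround is to leave the column as strings when reading the CSV and parse it on the Python side. The sketch below is only an illustration, not a fix in pyarrow itself: the helper names `parse_timestamps` and `with_parsed_column` are made up for this example, it assumes every non-empty value matches the format shown above, and it goes through Python lists, so it will be slow on large files.

```python
from datetime import datetime

# Format of the values described above, e.g. '2015-01-09 00:00:00.000'
FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_timestamps(values):
    """Parse strings with Python's datetime.strptime; None and '' stay None."""
    return [
        None if v is None or v == "" else datetime.strptime(v, FORMAT)
        for v in values
    ]


def with_parsed_column(table, name):
    """Return a copy of a pyarrow Table with string column `name` as timestamp[ms].

    Hypothetical helper for illustration; pyarrow is imported here so that
    parse_timestamps can be used and tested without it.
    """
    import pyarrow as pa

    parsed = pa.array(
        parse_timestamps(table.column(name).to_pylist()),
        type=pa.timestamp("ms"),
    )
    index = table.schema.get_field_index(name)
    return table.set_column(index, name, parsed)
```

With a table read via `csv.read_csv('path_to_csv')`, calling `with_parsed_column(table, 'date_column')` should give a schema where `date_column` is `timestamp[ms]`, assuming the column name matches the one in the cast attempt above.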

      I am using pyarrow=1.0.1 in a Linux Docker container.

      I tried to send an email to the user mailing list, but it keeps returning a Mailer Daemon error.

People

    • Assignee: Unassigned
    • Reporter: Gary (gclarkjr5)