Details
- Type: Bug
- Status: Closed
- Priority: Minor
- Resolution: Duplicate
Description
Hi,
Not sure if I am missing something, but I am unable to get pyarrow to parse my datetime values as timestamps; they are always inferred as strings.
My strings are arriving in CSVs with this format: '2015-01-09 00:00:00.000'
I have tried:
```
from pyarrow import csv

convert_opts = csv.ConvertOptions(timestamp_parsers=['%Y-%m-%d %H:%M:%S.%f'])
df = csv.read_csv('path_to_csv', convert_options=convert_opts)
print(df.schema)
```
This changes nothing: the columns holding these timestamps still show up as strings in the schema.
I have also tried casting:
```
import pyarrow as pa
from pyarrow import csv

dfschema = pa.schema([
    ('date_column', pa.timestamp('ms'))
])
df = csv.read_csv('path_to_csv')
df = df.cast(target_schema=dfschema)
```
This yields the error: "pyarrow.lib.ArrowInvalid: Failed to parse string: 2015-01-09 00:00:00.000"
I am using pyarrow=1.0.1 in a Linux Docker container.
I tried to send an email to the user mailing list, but it keeps returning a Mailer Daemon error.
Issue Links
- is a child of: ARROW-15894 [C++] Strptime issues umbrella (Open)
- is duplicated by: ARROW-15883 [C++] Support for fractional seconds in strptime() (Open)